(57) Abstract

A system and method are disclosed that compress and decompress images. The compression system and method (126, 128,
130, 132) includes an encoder (130) which compresses images and stores such compressed images in a unique file format, and a decoder
(110) which decompresses images. The encoder (130) optimizes the encoding process to accommodate different image types with fuzzy
logic methods (152) that automatically analyze and decompose a source image, classify its components, select the optimal compression
method for each component, and determine the optimal parameters of the selected compression methods. The encoding methods include:
a Reed Spline Filter (138), a discrete cosine transform (136), a differential pulse code modulator (140), an enhancement analyzer (144),
an adaptive vector quantizer (134) and a channel encoder (132) to generate a plurality of data segments that contain the compressed image.
The plurality of data segments are layered in the compressed file (104) to optimize the decoding process. The first layer allows the decoder
(110) to display the compressed image as a miniature or a coarse quality full sized image; the decoder (110) then adds additional detail
and sharpness to the displayed image as each new layer is received. The decoder (110) uses optimal decompression methods to expand the
compressed image file.

METHOD AND APPARATUS FOR COMPRESSING IMAGES

Field of the Invention

This invention relates to the compression and
decompression of digital data and, more particularly, to the
reduction in the amount of digital data necessary to store
and transmit images.

Background of the Invention

Image compression systems are commonly used in computers
to reduce the storage space and transmittal times associated
with storing, transferring and retrieving images. Due to
increased use of images in computer applications, and the
increase in the transfer of images, a variety of image
compression techniques have attempted to solve the problems
associated with the large amounts of storage space (i.e.,
hard disks, tapes or other devices) needed to store images.

Conventional devices store an image as a two-dimensional 
array of picture elements, or pixels. The number of pixels 
determines the resolution of an image. Typically the 
resolution is measured by stating the number of horizontal 
and vertical pixels contained in the two-dimensional image
array. For example, a 640 by 480 image has 640 pixels across
and 480 from top to bottom, for a total of 307,200 pixels.

While the number of pixels represents the image 
resolution, the number of bits assigned to each pixel 
represents the number of available intensity levels of each 
pixel. For example, if a pixel is only assigned one bit, the 
pixel can represent a maximum of two values. Thus the range 
of colors which can be assigned to that pixel is limited to 
two (typically black and white). In color images, the bits
assigned to each pixel represent the intensity values of the
three primary colors of red, green and blue. In present
"true color" applications, each pixel is normally represented
by 24 bits where 8 bits are assigned to each primary color
allowing the encoding of 16.8 million (2^8 x 2^8 x 2^8) different
colors.

Consequently, color images require large amounts of 
storage capacity. For example, a typical color (24 bits per 
pixel) image with a resolution of 640 by 480 requires 
approximately 922,000 bytes of storage. A larger 24-bit
color image with a 2000 by 2000 pixel resolution requires
approximately twelve million bytes of storage. As a result,
image-based applications such as interactive shopping,
multimedia products, electronic games and other image-based
presentations require large amounts of storage space to 
display high quality color images. 
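
The storage figures quoted above follow directly from width x height x bits per pixel. The short sketch below reproduces that arithmetic; the function name is illustrative only and is not part of the disclosed system.

```python
def uncompressed_bytes(width, height, bits_per_pixel=24):
    """Raw storage, in bytes, for an uncompressed image."""
    return width * height * bits_per_pixel // 8

# 640 x 480 true-color image: 921,600 bytes (about 922,000)
small = uncompressed_bytes(640, 480)

# 2000 x 2000 true-color image: 12,000,000 bytes
large = uncompressed_bytes(2000, 2000)

# Number of distinct colors representable with 8 bits per primary
colors = 2 ** 8 * 2 ** 8 * 2 ** 8  # 16,777,216, i.e. about 16.8 million

print(small, large, colors)
```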

In order to reduce storage requirements, an image is 
compressed (encoded) and stored as a smaller file which 
requires less storage space. In order to retrieve and view 
the compressed image, the compressed image file is expanded 
(decoded) to its original size. The decoded (or
"reconstructed") image is usually an imperfect or "lossy"
representation of the original image because some information 
may be lost in the compression process. Normally, the 
greater the amount of compression the greater the divergence 
between the original image and the reconstructed image. The 
amount of compression is often referred to as the compression 
ratio. The compression ratio is the amount of storage space 
needed to store the original (uncompressed) digitized image 
file divided by the amount of storage space needed to store 
the corresponding compressed image file. 
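
As a worked illustration of this definition, the sketch below computes a compression ratio from two file sizes. The function name and the 10,000-byte compressed size are assumptions chosen for the example.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size, per the definition above."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes

# A 640 x 480, 24-bit image (921,600 bytes) compressed to 10,000 bytes
ratio = compression_ratio(921600, 10000)
print(f"{ratio:.1f}-to-1")  # roughly 92-to-1
```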

By reducing the amount of storage space needed to store 
an image, compression is also used to reduce the time needed 
to transfer and communicate images to other locations. In 
order to transfer an image, the data bits that represent the 
image are sent via a data channel to another location. The 
sequence of transmitted bytes is called the data stream. 
Generally, the image data is encoded and the compressed image 
data stream is sent over a data channel and when received, 
the compressed image data is decoded to recreate the original
image. Thus, compression speeds the transmission of image
files by reducing their size.

Several processes have been developed for compressing 
the data required to represent an image. Generally, the
processes rely on two methods: 1) spatial or time domain 
compression, and 2) frequency domain compression. In 
frequency domain compression, the binary data representing 
each pixel in the space or time domain are mapped into a new 
coordinate system in the frequency domain. 

In general, the mathematical transforms, such as the 
discrete cosine transform (DCT), are chosen so that the
signal energy of the original image is preserved, but the
energy is concentrated in a relatively few transform
coefficients. Once transformed, the data is compressed by
quantization and encoding of the transform coefficients.
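
The energy-preservation and energy-concentration properties can be checked numerically. The sketch below is a plain one-dimensional orthonormal DCT-II, not the optimized DCT of the invention, and the eight sample values are made up to resemble a smooth row of pixels.

```python
import math

def dct_ii(x):
    """Orthonormal one-dimensional DCT-II of a sequence of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def energy(values):
    """Sum of squared values (signal energy)."""
    return sum(v * v for v in values)

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical smooth pixel row
coeffs = dct_ii(row)

# Energy is preserved by the orthonormal transform ...
total = energy(coeffs)
# ... and most of it lands in the two largest coefficients.
top2 = sum(sorted((c * c for c in coeffs), reverse=True)[:2])
print(total, top2 / total)
```

Because so few coefficients carry most of the energy, the remaining ones can be quantized coarsely or dropped, which is where the compression comes from.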

Optimization of the process of compressing an image
includes increasing the compression ratio while maintaining
the quality of the original image, reducing the time to
encode an image, and reducing the time to decode a compressed
image. In general, a process that increases the compression
ratio or decreases the time to compress an image results in
a loss of image quality. A process that increases the
compression ratio and maintains a high quality image often
results in longer encoding and decoding times. Accordingly,
it would be advantageous to increase the compression ratio
and reduce the time needed to encode and decode an image
while maintaining a high quality image.

It is well known that image encoders can be optimized
for specific image types. For example, different types of
images may include graphical, photographic, or typographic
information or combinations thereof. As discussed in more
detail below, the encoding of an image can be viewed as a
multi-step process that uses a variety of compression methods
which include filters, mathematical transformations,
quantization techniques, etc. In general each compression
method will compress different image types with varying
comparative efficiency. These compression methods can be
selectively applied to optimize an encoder with respect to a
certain type of image. In addition to selectively applying
various compression methods, it is also possible to optimize
an encoder by varying the parameters (e.g., quantization
tables) of a particular compression method.

Broadly speaking, however, the prior art does not
provide an adaptive encoder that automatically decomposes a
source image, classifies its parts, and selects the optimal
compression methods and the optimal parameters of the
selected compression methods resulting in an optimized
encoder that increases relative compression rates.

Once an image is optimally compressed with an encoder,
the set of compressed data is stored in a file. The
structure of the compressed file is referred to as the file
format. The file format can be fairly simple and common, or
the format can be quite complex and include a particular
sequence of compressed data or various types of control
instructions and codes.

The file format (the structure of the data in the file)
is especially important when compressed data in the file will
be read and processed sequentially and when the user desires
to view or transmit only part of a compressed image file.
Accordingly, it would be advantageous to provide a file
format that "layers" the compressed image components,
arranging those of greatest visual importance first, those of
secondary visual importance second, and so on. Layering the
compressed file format in such a way allows the first segment
of the compressed image file to be decoded prior to the
remainder of the file being received or read by the decoder.
The decoder can display the first segment (layer) as a
miniature version of the entire image or can enlarge the
miniature to display a coarse or "splash" quality rendition
of the original image. As each successive file segment or
layer is received, the decoder enhances the quality of the
displayed picture by selectively adding detail and correcting
pixel values.

Like the encoding process, the decoding of an image can 
be viewed as a multi-step process that uses a variety of 
decoding methods which include inverse mathematical 
transformations, inverse quantization techniques, etc. 
Conventional decoders are designed to have an inverse 
function relative to the encoding system. These inverse 
decoding methods must match the encoding process used to 
encode the image. In addition, where an encoder makes
content-sensitive adaptations to the compression algorithm,
the decoder must apply a matching content-sensitive decoding
process.

Generally, a decoder is designed to match a specific 
encoding process. Prior art compression systems exist that 
allow the decoder to adjust particular parameters, but the 
prior art encoders must also transmit accompanying tables and 
other information. In addition, many conventional decoders 
are limited to specific decoding methods that do not 
accommodate content-sensitive adaptations.

Summary of the Invention 
The problems outlined above are solved by the method and 
apparatus of the present invention. That is, the computer- 
based image compression system of the present invention 
includes a unique encoder which compresses images and a 
unique decoder which decompresses images. The unique 
compression system obtains high compression ratios at all 
image quality levels while achieving relatively quick 
encoding and decoding times. 

A high compression ratio enables faster image 
transmission and reduces the amount of storage space required 
to store an image. When compared with conventional 
compression techniques, such as the Joint Photographic
Experts Group (JPEG), the present invention significantly
increases the compression ratio for color images which, when
decompressed, are of comparable quality to the JPEG images.
The exact improvement over JPEG will depend on image content,
resolution, and other factors.

Smaller image files translate into direct storage and
transmission time savings. In addition, the present
invention reduces the number of operations to encode and
decode an image when compared to JPEG and other compression
methods of a similar nature. Reducing the number of
operations reduces the amount of time and computing resources
needed to encode and decode an image, and thus improves
computer system response times.

Furthermore, the image compression system of the present
invention optimizes the encoding process to accommodate
different image types. As explained below, the present
invention uses fuzzy logic techniques to automatically
analyze and decompose a source image, classify its
components, select the optimal compression method for each
component, and determine the optimal content-sensitive
parameters of the selected compression methods. The encoder
does not need prior information regarding the type of image
or information regarding which compression methods to apply.
Thus, a user does not need to provide compression system
customization or need to set the parameters of the
compression methods.

The present invention is designed with the goal of
providing an image compression system that reliably
compresses any type of image with the highest achievable
efficiency, while maintaining a consistent range of viewing
qualities. Automating the system's adaptivity to varied
image types allows for a minimum of human intervention in the
encoding process and results in a system where the
compression and decompression processes are virtually
transparent to the users.

The encoder and decoder of the present invention contain
a library of encoding methods that are treated as a
"toolbox." The toolbox allows the encoder to selectively
apply particular encoding methods or tools that optimize the
compression ratio for a particular image component. The
toolbox approach allows the encoder to support many different
encoding methods in one program, and accommodates the
invention of new encoding methods without invalidating
existing decoders. The toolbox approach thus allows
upgradeability for future improvements in compression methods
and adaptation to new technologies.

A further feature of the present invention is that the
encoder creates a file format that segments or "layers" the
compressed image. The layering of the compressed image
allows the decoder to display image file segments, beginning
with the data at the front of the file, in a coherent
sequence which begins with the decoding and display of the
information that constitutes the core of the image as defined
by human perception. This core information can appear as a
good quality miniature of the image and/or as a full sized
"splash" or coarse quality version of the image. Both the
miniature and splash image enable the user to view the
essence of an image from a relatively small amount of encoded
data. In applications where the image file is being
transmitted over a data channel, such as a telephone line or
limited bandwidth wireless channel, display of the miniature
and/or splash image occurs as soon as the first segment or
layer of the file is received. This allows users to view the
image quickly and to see detail being added to the image as
subsequent layers are received, decoded, and added to the
core image.

The decoder decompresses the miniature and the full
sized splash quality image from the same information. User-
specified preferences and the application determine whether
the miniature and/or the full sized splash quality image are
displayed for any given image.

Whether the first layer is displayed as a miniature or
a splash quality full size image, the receipt of each
successive layer allows the decoder to add additional image
detail and sharpness. Information from the previous layer is
supplemented, not discarded, so that the image is built layer
by layer. Thus a single compressed file with a layered file
format can store both a thumbnail and a full size version of
the image and can store the full size version at various
quality levels without storing any redundant information.

The layered approach of the present invention allows the
transmission or decoding of only the part of the compressed
file which is necessary to display a desired image quality.
Thus, a single compressed file can generate a thumbnail and
different quality full size images without the need to
recompress the file to a smaller size and lesser quality, or
store multiple files compressed to different file sizes and
quality levels.

This feature is particularly advantageous for on-line
service applications, such as shopping or other applications
where the user or the application developer may want several
thumbnail images downloaded and presented before the user
chooses to receive the entire full size, high quality image.
In addition to conserving the time and transmission costs
associated with viewing a variety of high quality images that
may not be of interest, the user need only subsequently
download the remainder of each image file to view the higher
detail versions of the image.

The layered format also allows the storage of different
layers of the compressed data file separate from one another.
Thus, the core image data (miniature) can be stored locally
(e.g., in fast RAM memory for fast access), and the higher
quality "enhancement" layers can be stored remotely in lower
cost bulk storage.

A further feature of the layered file format of the
present invention allows the addition of other compressed
data information. The layered and segmented file format is
extendable so that new layers of compressed information such
as sound, text and video can be added to the compressed image
data file. The extendable file format allows the compression
system to adapt to new image types and to combine compressed
image data with sound, text and video.

Like the encoder, the decoder of the present invention 
includes a toolbox of decoding methods. The decoding process 
can begin with the decoder first determining the encoding 
methods used to encode each data segment. The decoder 
determines the encoding methods from instructions the encoder 
inserts into the compressed data file. 

Adding decoder instructions to the compressed image data
provides several advantages. A decoder that recognizes the
instructions can decode files from a variety of different
encoders, accommodate content-sensitive encoding methods, and
adjust to user specific needs. The decoder of the present
invention also skips parts of the data stream that contain
data that are unnecessary for a given rendition of the image,
or ignores parts of the data stream that are in an unknown
format. The ability to ignore unknown formats allows future
file layers to be added while maintaining compatibility with
older decoders.
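
The skip-and-ignore behavior can be sketched with a simple type-length-payload segment layout. The layout here (a 1-byte segment type followed by a 2-byte big-endian length), the type codes, and the function names are all assumptions made for illustration; they are not the segment format of the invention.

```python
import struct

# Hypothetical segment type codes known to this decoder
KNOWN_TYPES = {1: "miniature", 2: "standard", 3: "enhancement"}

def pack_segment(seg_type, payload):
    """Build one segment: 1-byte type, 2-byte big-endian length, payload."""
    return struct.pack(">BH", seg_type, len(payload)) + payload

def read_segments(stream, max_type=3):
    """Yield (name, payload) for known segments up to max_type.

    Unknown segment types, and known layers above the requested
    rendition, are skipped by jumping over their declared length.
    """
    pos = 0
    while pos + 3 <= len(stream):
        seg_type, length = struct.unpack_from(">BH", stream, pos)
        pos += 3
        payload = stream[pos:pos + length]
        pos += length
        if seg_type in KNOWN_TYPES and seg_type <= max_type:
            yield KNOWN_TYPES[seg_type], payload

stream = (pack_segment(1, b"mini") + pack_segment(9, b"future")
          + pack_segment(2, b"std") + pack_segment(3, b"enh"))

print(list(read_segments(stream)))              # type 9 is ignored
print(list(read_segments(stream, max_type=1)))  # thumbnail only
```

Because every segment declares its own length, an older decoder can step over a segment type it has never seen without losing its place in the stream.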

In a preferred embodiment of the present invention, the 
encoder compresses an image using a first Reed Spline Filter, 
an image classifier, a discrete cosine transform, a second 
and third Reed Spline Filter, a differential pulse code 
modulator, an enhancement analyzer, and an adaptive vector 
quantizer to generate a plurality of data segments that 
contain the compressed image. The plurality of data segments 
are further compressed with a channel encoder. 

The Reed Spline Filter includes a color space conversion 
transform, a decimation step and a least mean squared error 
(LMSE) spline fitting step. The output of the first Reed 
Spline Filter is then analyzed to determine an image type for 
optimal compression. The first Reed Spline Filter outputs 
three components which are analyzed by the image classifier. 
The image classifier uses fuzzy logic techniques to classify 
the image type. Once the image type is determined, the first 
component is separated from the second and third components
and further compressed with an optimized discrete cosine
transform and an adaptive vector quantizer. The second and
third components are further compressed with a second and
third Reed Spline Filter, the adaptive vector quantizer, and
a differential pulse code modulator.
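
For readers unfamiliar with differential pulse code modulation, the sketch below shows its simplest form: each sample is replaced by its difference from the previous sample. It assumes a previous-sample predictor and no quantization, which may differ from the DPCM 140 of the preferred embodiment.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous sample."""
    prev = 0
    diffs = []
    for s in samples:
        diffs.append(s - prev)
        prev = s
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    prev = 0
    samples = []
    for d in diffs:
        prev += d
        samples.append(prev)
    return samples

values = [100, 102, 103, 103, 101, 98]  # hypothetical smooth data
diffs = dpcm_encode(values)
print(diffs)  # after the first entry, the differences are small
```

Smooth data yields small differences, which a later entropy coding stage can represent with fewer bits than the raw values.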

The enhancement analyzer enhances areas of an image 
determined to be the most visually important, such as text or 
edges. The enhancement analyzer determines the visual 
priority of pixel blocks. The pixel block dimensions 
typically correspond to 16 x 16 pixel blocks in the source 
image. In addition, the enhancement analyzer prioritizes 
each pixel block so that the most important enhancement 
information is placed in the earliest enhancement layers so 
that it can be decoded first. The output of the enhancement 
analyzer is compressed with the adaptive vector quantizer. 
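
One way to picture the prioritization of pixel blocks is to rank them by a local activity measure, so that blocks containing edges or text come first. The measure used below (sum of absolute neighbor differences) and the function names are assumptions for illustration only; the enhancement analyzer's actual criteria are described later in the specification.

```python
def block_activity(image, top, left, size):
    """Sum of absolute horizontal and vertical neighbor differences."""
    total = 0
    for r in range(top, top + size):
        for c in range(left, left + size):
            if c + 1 < left + size:
                total += abs(image[r][c] - image[r][c + 1])
            if r + 1 < top + size:
                total += abs(image[r][c] - image[r + 1][c])
    return total

def prioritize_blocks(image, size=16):
    """Return block origins (top, left), most active block first."""
    height, width = len(image), len(image[0])
    blocks = [(block_activity(image, t, l, size), t, l)
              for t in range(0, height - size + 1, size)
              for l in range(0, width - size + 1, size)]
    blocks.sort(key=lambda b: -b[0])  # stable sort keeps scan order on ties
    return [(t, l) for _, t, l in blocks]

# Tiny example with 2 x 2 blocks: flat region on the left, edge on the right
image = [[10, 10, 0, 255],
         [10, 10, 0, 255]]
print(prioritize_blocks(image, size=2))
```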

A user may set the encoder to compute a color palette 
optimized to the color image. The color palette is combined 
with the output of the discrete cosine transform, the 
adaptive vector quantizer, the differential pulse code 
modulator, and the enhancement analyzer to create a plurality 
of data segments. The channel encoder then interleaves and 
compresses the plurality of data segments. 

Brief Description of the Drawings
These and other aspects, advantages, and novel features 
of the invention will become apparent upon reading the 
following detailed description and upon reference to 
accompanying drawings in which: 

FIG. 1 is a block diagram of an image compression system
that encodes, transfers and decodes an image and includes a
source image, an encoder, a compressed file, a first storage
device, a data channel, a data stream, a decoder, a display,
a second storage device, and a printer;

FIG. 2 illustrates the multi-step decoding process and 
includes the source image, the encoder, the compressed file, 
the data channel, the data stream, the decoder, a thumbnail
image, a splash image, a panellized standard image, and the
final representation of the source image;

FIG. 3 is a block diagram of the encoder showing the 
four stages of the encoding process; 

FIG. 4 is a block diagram of the encoder showing a first 
Reed Spline Filter, a color space conversion transform, a Y 
miniature, a U miniature, an X miniature, an image 
classifier, an optimized discrete cosine transform, a 
discrete cosine transform residual calculator, an adaptive 
vector quantizer, a second and third Reed Spline Filter, a 
Reed Spline residual calculator, a differential pulse code
modulator, an enhancement analyzer, a high resolution
residual calculator, a palette selector, a plurality of data 
segments and a channel encoder; 

FIG. 5 is a block diagram of the image formatter; 

FIG. 6 is a block diagram of the Reed Spline Filter; 

FIG. 7 is a block diagram of the color space conversion 
transform; 

FIG. 8 is a block diagram of the image classifier; 

FIG. 9 is a block diagram of the optimized discrete 
cosine transform; 

FIG. 10 is a block diagram of the DCT residual
calculator; 

FIG. 11 is a block diagram of the adaptive vector 
quantizer; 

FIG. 12 is a block diagram of the second and third Reed 
Spline Filters; 

FIG. 13 is a block diagram of the Reed Spline residual
calculator;

FIG. 14 is a block diagram of the differential pulse 
code modulator; 

FIG. 15 is a block diagram of the enhancement analyzer; 

FIG. 16 is a block diagram of the high resolution 
residual calculator; 

FIG. 17 is the block diagram of the palette selector; 

FIG. 18 is the block diagram of the channel encoder;

FIG. 19 is a block diagram of the vector quantization
process;

FIGs. 20a and 20b show the segmented architecture of the
data stream;

FIG. 21 illustrates the normal segment; 

FIGs. 22a, 22b, 22c and 22d illustrate the layering and
interleaving of the plurality of data segments; 

FIG. 23 is a block diagram of the decoder of the present
invention;

FIG. 24 illustrates the multi-step decoding process and 
includes a Ym miniature, a Um miniature, an Xm miniature, the
thumbnail miniature, the splash image and the standard image, 
and the enhanced image; 

FIG. 25 is a block diagram of the decoder and includes 
an inverse Huffman encoder, an inverse DPCM, a dequantizer, 
a combiner, an inverse DCT, a demultiplexer, and an adder; 

FIG. 26 is a block diagram of the decoder and includes 
the interpolator, interpolation factors, a scaler, scale 
factors, a replicator, and an inverse color converter; 

FIG. 27 is a block diagram of the decoder that includes 
the inverse Huffman encoder, the combiner, the dequantizer, 
the inverse DCT, a pattern matcher, the adder, the
interpolator, and an enhancement overlay builder; 

FIG. 28 is a block diagram of the scaler with an input to
output ratio of five-to-three in the one-dimensional case;

FIG. 29 illustrates the process of bilinear 
interpolation;

FIG. 30 is a block diagram of the process of optimizing
the compression methods with the image classifier, the
enhancement analyzer, the optimized DCT, the AVQ, and the
channel encoder;

FIG. 31 is a block diagram of the image classifier; 

FIG. 32 is a flow chart of the process of creating an
adaptive uniform DCT quantization table;

FIG. 33 illustrates a table of several examples showing
the mapping from input measurements to input sets to output
sets;

FIG. 34 is a block diagram of image data compression;

FIG. 35 is a block diagram of a spline
decimation/interpolation filter;

FIG. 36 is a block diagram of an optimal spline filter;

FIG. 37 is a vector representation of the image,
processed image, and residual image;

FIG. 38 is a block diagram showing a basic optimization
block of the present invention;

FIG. 39 is a graphical illustration of a one-dimensional
bi-linear spline projection;

FIG. 40 is a schematic view showing periodic replication
of a two-dimensional image;

FIGs. 41a, 41b and 41c are perspective and plan views of
a two-dimensional planar spline basis;

FIG. 42 is a diagram showing representations of the
hexagonal tent function;

FIG. 43 is a flow diagram of compression and
reconstruction of image data;

FIG. 44 is a graphical representation of a normalized 
frequency response of a one-dimensional bi-linear spline 
basis; 

FIG. 45 is a graphical representation of a one-
dimensional eigenfilter frequency response;

FIG. 46 is a perspective view of a two-dimensional 
eigenfilter frequency response;

FIG. 47 is a plot of standard error as a function of
frequency for a one-dimensional cosinusoidal image;

FIG. 48 is a plot of original and reconstructed one- 
dimensional images and a plot of standard error; 

FIG. 49 is a first two-dimensional image reconstruction
for different compression factors;

FIG. 50 is a second two-dimensional image reconstruction
for different compression factors;

FIG. 51 shows plots of standard error for representative
images 1 and 2;

FIG. 52 is a compressed two-dimensional miniature using the
optimized decomposition weights;

FIG. 53 is a block diagram of a preferred adaptive
compression scheme in which the method of the present 
invention is particularly suited; 

FIG. 54 is a block diagram showing a combined sublevel
and optimal-spline compression arrangement;

FIG. 55 is a block diagram showing a combined sublevel
and optimal-spline reconstruction arrangement;

FIG. 56 is a block diagram showing a multi-resolution
optimized interpolation arrangement; and

FIG. 57 is a block diagram showing an embodiment of the
optimizing process in the image domain.

Detailed Description of the Invention
FIG. 1 illustrates a block diagram of an image
compression system that includes a source image 100, an
encoder 102, a compressed file 104, a first storage device
106, a communication data channel 108, a decoder 110, a
display 112, a second storage device 114, and a printer 116.
The source image 100 is represented as a two-dimensional
image array of picture elements, or pixels. The number of
pixels determines the resolution of the source image 100
which is typically measured by the number of horizontal and
vertical pixels contained in the two-dimensional image array.

Each pixel is assigned a number of bits that represent 
the intensity level of the three primary colors: red, green, 
and blue. In the preferred embodiment, the full-color source 
image 100 is represented with 24 bits, where 8 bits are 
assigned to each primary color. Thus, the total storage 
required for an uncompressed image is computed as the number 
of pixels in the image times the number of bits used to 
represent each pixel (referred to as bits per pixel).

As discussed in more detail below, the encoder 102 uses
decimation, filtering, mathematical transforms, and
quantization techniques to concentrate the image into fewer
data samples representing the image with fewer bits per pixel
than the original format. Once the source image 100 is
compressed with the encoder 102, the set of compressed data
is assembled in the compressed file 104. The compressed
file 104 is stored in the first storage device 106 or
transmitted to another location via the data channel 108. If
the compressed file 104 is transmitted to another location,
the data stored in the compressed file 104 is transmitted
sequentially via the data channel 108. The sequence of bits
in the compressed file 104 that are transmitted via the data
channel 108 is referred to as a data stream 118.

The decoder 110 expands the compressed file 104 to the
original source image size. During the process of decoding
the compressed file 104, the decoder 110 displays the
expanded source image 100 on the display 112. In addition,
the decoder 110 may store the expanded compressed file 104 in
the second storage device 114 or print the expanded
compressed file 104 on the printer 116.

For example, if the source image 100 comprises a
640 x 480, 24-bit color image, the amount of memory needed to
store and display the source image 100 is approximately
922,000 bytes. In the preferred embodiment, the encoder 102
computes the highest compression ratio for a given decoding
quality and playback model. The playback model allows a user
to select the decoding mode as is discussed in more detail
below. The compressed data are then assembled in the
compressed file 104 for transmittal via the data channel 108
or stored in the first storage device 106. For example, at
a 92-to-1 compression ratio, the 922,000 bytes that represent
the source image 100 are compressed into approximately 10,000
bytes. In addition, the encoder 102 arranges the compressed
data into layers in the compressed file 104.

Referring to FIG. 2, it can be seen that the layering of
the compressed file 104 allows the decoder 110 to display a 
thumbnail image and progressively improving quality versions 
of the source image 100 before the decoder 110 receives the 
entire compressed file 104. The first data expanded by the 
decoder 110 can be viewed as a thumbnail miniature 120 of the 
original image or as a coarse quality "splash" image 122 with
the same dimensions as the original image. The splash image 
122 is a result of interpolating the thumbnail miniature to 
the dimensions of the original image. As the decoder 110 
continues to receive data from the data stream 118, the 
decoder 110 creates a standard image 124 by decoding the 
second layer of information and adding it to the splash image 
122 data to create a higher quality image. The encoder 102 
can create a user-specified number of layers in which each 
layer is decoded and added to the displayed image as data is 
received. Upon receiving the entire compressed file 104 via 
the data stream 118, the decoder 110 displays an enhanced 
image 105 that is the highest quality reconstructed image 
that can be obtained from the compressed data stream 118. 

FIG. 3 illustrates a block diagram of the encoder 102
constructed in accordance with the present invention. The
encoder 102 compresses the source image 100 in four main
stages. In a first stage 126, the source image 100 is 
formatted, processed by a Reed Spline Filter and color 
converted. In a second stage 128, the encoder 102 classifies 
the source image 100 in blocks. In a third stage 130, the 
encoder 102 selectively applies particular encoding methods 
that optimize the compression ratio. Finally, the compressed 
data are interleaved and channel encoded in a fourth stage 
132. 

The encoder 102 contains a library of encoding methods 
that are treated as a toolbox. The toolbox allows the 
encoder 102 to selectively apply particular encoding methods 
that optimize the compression ratio for a particular image
type. In the preferred embodiment, the encoder 102 includes
at least one of the following: an adaptive vector quantizer
(AVQ 134), an optimized discrete cosine transform (optimized
DCT 136), a Reed Spline Filter 138 (RSF), a differential
pulse code modulator (DPCM 140), a run length encoder (RLE
142), and an enhancement analyzer 144.

FIG. 4 illustrates a more detailed block diagram of the
encoder 102. The first stage 126 of the encoder 102 includes
a formatter 146, a first Reed Spline Filter 148 and a color
space converter 150 which produces Y data 186, and U and X
data 188. The second stage 128 includes an image classifier
152. The third stage includes an optimized discrete cosine
transform and adaptive DCT quantization (optimized DCT 136),
a DCT residual calculator 154, the adaptive vector quantizer
(AVQ 134), a second and a third Reed Spline Filter 156, a
Reed Spline residual calculator 158, the differential pulse
code modulator (DPCM 140), a resource file 160, the
enhancement analyzer 144, a high resolution residual
calculator 162, and a palette selector 164. The fourth stage
includes a plurality of data segments 166 and a channel
encoder 168. The output of the channel encoder 168 is stored
in the compressed file 104.

The formatter 146, as shown in more detail in FIG. 5,
converts the source image 100 from its native format to a
24-bit red, green and blue pixel array. For example, if the
source image 100 is an 8-bit palettized image, the formatter
converts the 8-bit palettized image to a 24-bit red, green,
and blue equivalent.
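
The palette expansion performed by the formatter can be sketched as follows. This is a minimal illustration, not the formatter 146 itself: the function name `expand_palette` and the sample palette are hypothetical, and the image is modeled as a list of rows of palette indexes.

```python
def expand_palette(indexed_rows, palette):
    """Expand palette indexes into 24-bit (R, G, B) pixels."""
    return [[palette[i] for i in row] for row in indexed_rows]


# Hypothetical 4-entry palette and a 2 x 3 indexed image.
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
indexed = [[0, 1, 2],
           [3, 3, 1]]
rgb = expand_palette(indexed, palette)
```

Each output pixel carries three 8-bit components, so the expanded array needs three bytes per pixel regardless of how many palette entries the source used.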

The first Reed Spline Filter 148, illustrated in more
detail in FIG. 6, uses a two-step process to compress the
formatted source image 100. The two-step process comprises
a decimation step performed in block 170 and a spline fitting
step performed in a block 172. As explained in more detail
below, the decimation step in the block 170 decimates each
color component of red, green, and blue by a factor of two
along the vertical and horizontal dimensions using a Reed
Spline decimation kernel. The decimation factor is called
"tau." The R_tau2' decimated data 174 corresponds to the red
component decimated by a factor of 2. The G_tau2' decimated
data 176 corresponds to the green component decimated by a
factor of 2. The B_tau2' decimated data 178 corresponds to
the blue component decimated by a factor of 2.

In the spline fitting step in block 172, the first Reed
Spline Filter 148 partially restores the source image detail
lost by the decimation in block 170. The spline fitting step
in block 172 processes the R_tau2' decimated data 174, the
G_tau2' decimated data 176, and the B_tau2' decimated data 178
to calculate optimal reconstruction weights.

As explained in more detail below, the decoder 110 will
interpolate the decimated data into a full sized image. In
this interpolation, the decoder 110 uses the reconstruction
weights which have been calculated by the Reed Spline Filter
in such a way as to minimize the mean squared error between
the original image components and the interpolated image
components. Accordingly, the Reed Spline Filter 148 causes
the interpolated image to match the original image more
closely and increases the overall sharpness of the
interpolated picture. In addition, reducing the error
arising from the decimation step in block 170 reduces the
amount of data needed to represent the residual image. The
residual image is the difference between the reconstructed
image and the original image.

The reconstruction weights output from the Reed Spline
Filter 148 form a "miniature" of the original source image
100 for each primary color of red, green, and blue, wherein
each red, green, and blue miniature is one-quarter the
resolution of the original source image 100 when a tau of 2
is used.

More specifically, the preferred color space converter
150 transforms the R_tau2 miniature 180, the G_tau2 miniature
182 and the B_tau2 miniature 184 output by the first Reed
Spline Filter 148 into a different color coordinate system in
which one component is the luminance Y data 186 and the other
two components are related to the chrominance U and X data
188. The color space converter 150 transforms the RGB to the
YUX color space according to the following formulas:

Y = 0.29900R + 0.58700G + 0.11400B

U = 0.16870R + 0.33120G + 0.50000B

X = 0.50000R - 1.08216G + 0.91869B

Referring to FIG. 6, it can be seen that an R_tau2
miniature 180 corresponds to a red miniature that is decimated
and spline fitted by a factor of 2. A G_tau2 miniature 182
corresponds to a green miniature that is decimated and spline
fitted by a factor of 2. A B_tau2 miniature 184 corresponds
to a blue miniature that is decimated and spline fitted by a
factor of 2.

FIG. 7 illustrates the color space converter 150 of FIG.
4. The color space converter 150 transforms the R_tau2
miniature 180, the G_tau2 miniature 182 and the B_tau2
miniature 184 output by the first Reed Spline Filter 148 into
a different color coordinate system in which one component is
the luminance Y data 186 and the other two components are
related to the chrominance U and X data 188 as shown in FIG.
4. Thus the color space converter 150 transforms the R_tau2
miniature 180, the G_tau2 miniature 182 and the B_tau2
miniature 184 into a Y_tau2 miniature 190, a U_tau2 miniature
192 and an X_tau2 miniature 194.

Referring to FIG. 8, it can be seen that the second
stage 128 of the encoder 102 includes an image classifier 152
that determines the image type by analyzing the Y_tau2
miniature 190, the U_tau2 miniature 192 and the X_tau2
miniature 194. The image classifier 152 uses a fuzzy logic
rule base to classify an image into one or more of its known
classes. In the preferred embodiment, these classes include
gray scale, graphics, text, photographs, high activity and
low activity images. The image classifier 152 also
decomposes the source image 100 into block units and
classifies each block. Since the source image 100 includes
a combination of different image types, the image classifier
152 sub-divides the source image 100 into distinct regions.
The image classifier 152 then outputs the control script 196
that specifies the correct compression methods for each
region. The control script 196 specifies which compression
methods to apply in the third stage 130, and specifies the
channel encoding methods to apply in the fourth stage 132.

As shown in FIG. 4, during the third stage 130, the
encoder 102 uses the control script 196 to select the optimal
compression methods from its compression toolbox. The
encoder 102 separates the Y data 186 from the U and X data
188. Thus, the encoder 102 separates the Y_tau2 miniature
190 from the U_tau2 miniature 192 and the X_tau2 miniature
194, and passes the Y_tau2 miniature 190 to the optimized DCT
136, and passes the U_tau2 miniature 192 and the X_tau2
miniature 194 to a second and third Reed Spline Filter 156.

As illustrated in FIG. 9, the optimized DCT 136 
subdivides the Y_tau2 miniature 190 into a set of 8 x 8 pixel 
blocks and transforms each 8 x 8 pixel block into sixty-four 
DCT coefficients 198. The DCT coefficients include the AC 
terms 200 and the DC terms 201. The DCT coefficients 198 are 
analyzed by the optimized DCT 136 to determine optimal 
quantization step sizes and reconstruction values. The 
optimized DCT 136 stores the optimal quantization step sizes 
(uniform or non-uniform) in a quantization table Q 202 and 
outputs the reconstruction values to the CS data segment 204. 
The optimized DCT 136 then quantizes the DCT coefficients 198 
according to the quantization table Q 202. Once quantized, 
the optimized DCT 136 outputs the DCT quantized values 206 to
the DCT data segment 208.

In order to preserve the image information lost by the
optimized DCT 136, the DCT residual calculator 154 (shown in
FIG. 10) computes and compresses the DCT residual. The DCT
residual calculator 154 dequantizes in a dequantizer 209 the
DCT quantized values 206 stored in the DCT data segment 208
by multiplying the reconstruction values in the CS data
segment 204 with the DCT quantized values 206. The DCT
residual calculator 154 then reconstructs the dequantized DCT
components with an inverse DCT 210 to generate a
reconstructed dY_tau2 miniature 211. The reconstructed
dY_tau2 miniature 211 is subtracted from the original Y_tau2
miniature 190 to create an rY_tau2 residual 212.
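
The quantize, dequantize, inverse-transform and subtract sequence described above can be sketched for a single 8 x 8 block as follows. This is a simplified illustration under assumptions: it uses a standard orthonormal DCT-II and one uniform step size shared by all 64 coefficients, whereas the optimized DCT 136 derives its own step sizes and reconstruction values. All function names are hypothetical.

```python
import math

N = 8


def _scale(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


# Orthonormal DCT-II basis: BASIS[k][n] for frequency k and sample n.
BASIS = [[_scale(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]


def dct2(block):
    """Forward 2-D DCT of an 8 x 8 block (64 coefficients)."""
    return [[sum(BASIS[u][y] * BASIS[v][x] * block[y][x]
                 for y in range(N) for x in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coef):
    """Inverse 2-D DCT, recovering an 8 x 8 block of pixel values."""
    return [[sum(BASIS[u][y] * BASIS[v][x] * coef[u][v]
                 for u in range(N) for v in range(N))
             for x in range(N)] for y in range(N)]


def quantize(coef, step):
    return [[int(round(c / step)) for c in row] for row in coef]


def dequantize(quantized, step):
    # Assumption: the reconstruction value equals the quantization step.
    return [[q * step for q in row] for row in quantized]


def dct_residual(block, step):
    """Return (quantized coefficients, original minus reconstructed block)."""
    quantized = quantize(dct2(block), step)
    recon = idct2(dequantize(quantized, step))
    res = [[block[y][x] - recon[y][x] for x in range(N)] for y in range(N)]
    return quantized, res


block = [[(x * y) % 17 + 3 * x for x in range(N)] for y in range(N)]
quantized, res = dct_residual(block, 4.0)
```

Because the transform is orthonormal, the energy of the residual equals the energy of the quantization error, so a coarser step leaves more information for the later residual coding stage to capture.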

Referring to FIG. 11, it can be seen that the rY_tau2
residual 212 is further compressed with the AVQ 134. The 
technique of vector quantization is used to represent a block 
of information as a single index that requires fewer bits of 
storage. As explained in more detail below, the AVQ 134 
maintains a group of commonly occurring block patterns in a 
set of codebooks 214 stored in the resource file 160. The 
index references a particular block pattern within a 
particular codebook 214. The AVQ 134 compares the input 
block with the block patterns in the set of codebooks 214. 
If a block pattern in the set of codebooks 214 matches or 
closely approximates the input block, the AVQ 134 replaces 
the input block pattern with the index. 

Thus, the AVQ 134 compresses the input block information 
into a list of indexes. The indexes are decompressed by 
replacing each index with the block pattern each index 
references in the set of codebooks 214. The decoder 110, as 
explained in more detail below, also has a set of the 
codebooks 214. During the decoding process the decoder 110
uses the list of indexes to reference block patterns stored 
in a particular codebook 214. The original source cannot be 
precisely recovered from the compressed representation since 
the indexed patterns in the codebook will not match the input 
block exactly. The degree of loss will depend on how well 
the codebook matches the input block. 
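
The index substitution described above can be sketched as follows. This is a minimal exhaustive-search illustration, not the AVQ 134 itself; the function names and the three-pattern codebook are hypothetical, and each block is modeled as a flat list of sixteen values.

```python
def squared_error(block, pattern):
    """Squared error summed over all elements of a block."""
    return sum((x - p) ** 2 for x, p in zip(block, pattern))


def vq_encode(blocks, codebook):
    """Replace each block with the index of its closest codebook pattern."""
    return [min(range(len(codebook)),
                key=lambda j: squared_error(block, codebook[j]))
            for block in blocks]


def vq_decode(indexes, codebook):
    """Replace each index with the pattern it references."""
    return [list(codebook[j]) for j in indexes]


# Hypothetical codebook: flat dark, flat bright, and vertical stripes.
codebook = [[0] * 16, [8] * 16, [0, 8] * 8]
blocks = [[1] * 16, [7] * 16, [0, 9] * 8]
indexes = vq_encode(blocks, codebook)
decoded = vq_decode(indexes, codebook)
```

The decoded blocks are the codebook patterns rather than the inputs, which is why the text notes that the original cannot be precisely recovered.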

As shown in FIG. 11, the AVQ 134 compresses the rY_tau2 
residual 212, by sub-dividing the rY_tau2 residual 212 into 
4x4 residual blocks and comparing the residual blocks with 
codebook patterns as explained above. The AVQ 134 replaces 
the residual blocks with the codebook indexes that minimize 
the squared error. The AVQ 134 outputs the list of codebook

referred to as a dU_tau4 miniature 234. The interpolated
X_tau16 miniature 232 is referred to as a dX_tau4 miniature
236. The dU_tau4 miniature 234 and dX_tau4 miniature 236 are
subtracted from the actual U_tau4 miniature 226 and X_tau4
miniature 228 to create an rU_tau4 residual 238 and an
rX_tau4 residual 240.

As illustrated in FIG. 11, the rU_tau4 residual 238 and
the rX_tau4 residual 240 are further compressed with the AVQ
134. The AVQ 134 subdivides the rU_tau4 residual 238 and the
rX_tau4 residual 240 into 4 x 4 residual blocks. The
residual blocks are compared with blocks in the set of 
codebooks 214 to find the codebook patterns that minimize the 
squared error. The AVQ 134 compresses the residual block by 
assigning an index that identifies the corresponding block 
pattern in the set of codebooks 214. Once complete, the AVQ 
134 outputs the compressed residual as the VQ3 data segment 
242 and the VQ4 data segment 244. 

The U_tau16 miniature 230 and the X_tau16 miniature 232
are also compressed with the DPCM 140 as shown in FIG. 14.
The DPCM 140 outputs the low-detail color components as the
URCA data segment 246 and the XRCA data segment 248. The
URCA data segment 246 and the XRCA data segment 248 form the
low-detail color components that the decoder 110 uses to
create the color thumbnail miniature 120 if this is included
as a playback option in the compressed data stream 118.

FIG. 15 illustrates the enhancement analyzer 144 of the 
preferred embodiment. The Y_tau2 miniature 190, the U_tau4 
miniature 226, and the X_tau4 miniature 228 are analyzed to 
determine an enhancement list 250 that specifies the visual 
priority of every 16 x 16 image block. The enhancement 
analyzer 144 determines the visual priority of each 16 x 16 
image block by convolving the Y_tau2 miniature 190, the 
U_tau4 miniature 226, and the X_tau4 miniature 228 and 
comparing the result of the convolution to a threshold value 
E 252. The threshold value E 252 is user defined. The user 
can set the threshold value E 252 from zero to 200. The
threshold value E 252 determines how much enhancement
information the encoder 102 adds to the compressed file 104. 
Thus, setting the threshold value E 252 to zero will suppress 
any image enhancement information. 

If the result of convolving a particular 16 x 16 high 
resolution block is greater than the threshold value E 252, 
the 16 x 16 high-resolution block is prioritized and added to 
the enhancement list 250. Thus the enhancement list 250 
identifies which 16 x 16 blocks are coded and prioritizes how 
the 16 x 16 coded blocks are listed. 
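
The thresholding and prioritization step can be sketched as follows. This is an illustration under assumptions: the convolution that produces a score for each 16 x 16 block is not specified in this passage, so the scores are taken as given, and the function name and data layout are hypothetical. A threshold of zero is treated as "no enhancement", following the statement above.

```python
def build_enhancement_list(block_scores, threshold):
    """Return positions of blocks scoring above the threshold, highest first.

    block_scores maps a (row, column) block position to a hypothetical
    score standing in for the result of the convolution.
    """
    if threshold == 0:
        # The text states that a threshold of zero suppresses enhancement.
        return []
    selected = [(score, pos) for pos, score in block_scores.items()
                if score > threshold]
    selected.sort(key=lambda item: -item[0])
    return [pos for _, pos in selected]


scores = {(0, 0): 12.0, (0, 1): 150.0, (1, 0): 75.0, (1, 1): 40.0}
enhancement_list = build_enhancement_list(scores, 50)
```

The resulting order is what later allows the highest priority blocks to be placed in the first enhancement layer.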

The high resolution residual calculator 162, as shown in 
FIG. 16, determines the high resolution residual for each 
16 x 16 high resolution block identified in the enhancement 
list 250. The high resolution residual calculator 162 
translates the VQ1 data segment 224 from the AVQ 134 into a
reconstructed rY_tau2 residual 212 by mapping the indexes in
the VQ1 data segment 224 to the patterns in the codebook.
The reconstructed rY_tau2 residual is added to the dY_tau2
miniature 254 (dequantized DCT components). The result is
interpolated by a factor of two in the vertical and
horizontal dimensions and is subtracted from the original
Y_tau2 miniature 190 to form the high resolution residual.

The high resolution residual calculator 162 then 
extracts high resolution 16 x 16 blocks from the high 
resolution residual according to the priorities in the
enhancement list 250. As will be explained in more detail 
below, the high resolution residual calculator 162 outputs 
the highest priority blocks in the first enhancement layer, 
the next-highest priority blocks in the second enhancement 
layer, etc. The high resolution residual blocks are referred 
to as the xr_Y residual 256. 

The xr_Y residual 256 is further compressed with the AVQ
134. The AVQ 134 subdivides the xr_Y residual 256 into 4 x 4
residual blocks. The residual blocks are compared with
blocks in the codebook. If a residual block corresponds to
a block pattern in the codebook, the AVQ 134 compresses the
4 x 4 residual block by assigning an index that identifies
the corresponding block pattern in the codebook. Once
complete, the AVQ 134 outputs the compressed high resolution
residual to the VQ2 data segment 258.

FIG. 17 illustrates a block diagram of the palette
selector 164. The palette selector 164 computes a "best-fit"
24-bit color palette 260 for the decoder 110. The palette
selector 164 is optional and is user defined. The palette
selector 164 computes the color palette 260 from the Y_tau2
miniature 190, the U_tau2 miniature 192 and the X_tau2
miniature 194. The user can select a number of palette
entries N 262 to range from 0 to 255 entries. If the user
selects a zero, no palette is computed. If enabled, the
palette selector 164 adds the color palette 260 to a
plurality of data segments 166.

The channel encoder 168, as shown in FIG. 18,
interleaves and channel encodes the plurality of data
segments 166. Based on the user defined playback model 261,
the plurality of data segments 166 are interleaved as
follows: 1) as a single layer, single-pass comprising the
entire image, 2) as two layers comprising the thumbnail
miniature 120 and the remainder of the image 122 with
enhancement information interleaved into each data block
(panel) in the second layer, and 3) as multiple layers
comprising the thumbnail miniature 120, the standard image
124, the sharp image 105, and additional layers as specified
by the user. For each playback model an option exists to
interleave the data for panellized or non-panellized display.
The user defined playback model 261 is described in more
detail below.

After interleaving the plurality of data segments 166,
the channel encoder 168 compresses the plurality of data
segments 166 in response to the control script 196. In the
preferred embodiment, the channel encoder 168 compresses the
plurality of data segments 166 with: 1) a Huffman encoding
process that uses fixed tables, 2) a Huffman process that
uses adaptive tables, 3) a conventional LZ1 coding technique,
or 4) a run-length encoding process. The channel encoder 168
chooses the optimal compression method based on the image
type identified in the control script 196.
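
Of the four channel coding options listed above, run-length encoding is the simplest to illustrate. The following is a generic sketch, not the encoding used by the channel encoder 168: the (count, value) byte-pair layout and the 255-byte run limit are assumptions made for the example.

```python
def rle_encode(data):
    """Encode bytes as (count, value) pairs, with each count at most 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded):
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


sample = b"\x00" * 5 + b"\x07\x07\x01"
packed = rle_encode(sample)
```

Run-length coding pays off only on data with long runs, such as flat graphics regions, which is consistent with choosing the channel coding method per image type.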

The preferred embodiment of the AVQ 134 is illustrated
in FIG. 19. More specifically, the AVQ 134 optimizes the
vector quantization techniques described above. The AVQ 134
sub-divides the image data into a set of 4 x 4 pixel blocks
216. The 4 x 4 pixel blocks 216 include sixteen (16)
elements X_1, X_2, X_3, ..., X_16 216, that start at the upper
left-hand corner and move left to right on every row to the
bottom right-hand corner.

The codebook 214 of the present invention comprises M
predetermined sixteen-element vectors, P_1, P_2, P_3, ..., P_M 220
that correspond to common patterns found in the population of
images. The indexes I_1, I_2, I_3, ..., I_M 222 refer
respectively to the patterns P_1, P_2, P_3, ..., P_M 220.

Finding a best-fit pattern from the codebook 214
requires comparing each input block with every pattern in the
codebook 214 and selecting the index that corresponds to the
pattern with the minimum squared error summed over the 16
elements in the 4 x 4 block. The optimal code, C, for an
input vector X is the index j such that pattern P_j
satisfies:

$$\sum_{i=1}^{16} (X_i - P_{ij})^2 \le \sum_{i=1}^{16} (X_i - P_{ik})^2 \quad \text{for all } k = 1, \ldots, M$$

where X_i is the ith element of the input vector X,
and P_ik is the ith element of the VQ pattern P_k.

The comparison equation finds the best match by
selecting the minimum error term that results from comparing
the input block with the codebook patterns. In other words,
the AVQ 134 calculates the mean squared error term associated
with each pattern in the codebook 214 in order to determine
which pattern in the codebook 214 has the minimum squared
error (also referred to as the minimum error). The error
term is the mean square error produced by subtracting the
pattern element from the input block element X_i, squaring
the result and dividing by sixteen (16).

The process of searching for a matching pattern in the
codebook 214 is time-consuming. The AVQ 134 of the preferred
embodiment accelerates the pattern matching process with a
variety of techniques.

First, in order to find the optimal codebook pattern,
the AVQ 134 compares each input block term X_i to the
corresponding term in the codebook pattern P_1 being tested
and calculates the total squared error for the first codebook
pattern. This value is stored as the initial minimum error.
For each of the other patterns P_j = P_2, P_3, ..., P_M, the AVQ 134
subtracts the X_1 and P_1j terms and squares the result. The
AVQ 134 compares the resulting squared error to the minimum
error. If the squared error value is less than the minimum
error, the AVQ 134 continues with the next input term X_2 and
computes the squared error associated with X_2 and P_2j. The
AVQ 134 adds the result to the squared error of the first two
terms. The AVQ 134 then compares the accumulated squared
error for X_1 and X_2 to the minimum error. If the accumulated
squared error is less than the minimum error, the squared
error calculation continues until the AVQ 134 has evaluated
all 16 terms.

If at any time in the comparison, the accumulated
squared error for the new pattern is greater than the minimum
squared error, the current pattern is immediately rejected
and the AVQ 134 discontinues calculating the squared error
for the remaining input block terms for that pattern. If the
total squared error for the new pattern is less than the
minimum error, the AVQ 134 replaces the minimum error with
the squared error from the new pattern before making the
comparisons for the remaining patterns.

Also, if the accumulated squared error for a particular
codebook pattern is less than a predetermined threshold, the
codebook pattern is immediately accepted and the AVQ 134
quits testing other codebook patterns. Furthermore, the
codebook patterns in the present invention are ordered
according to the frequency of matches. Thus, the AVQ 134
begins by comparing the input block with patterns in the
codebook 214 that are most likely to match. Still further,
the codebook patterns are grouped by the sum of their squared
amplitudes. Thus the AVQ 134 selects a group of similar
codebook patterns by summing the squared amplitude of an
input block in order to determine which group of codebook
patterns to search.

To find an optimal codebook pattern, the AVQ 134 includes a set
of codebooks 214 that are adapted to the input blocks (i.e.,
codebooks 214 that are optimized for input blocks that
contain DCT residual values, high resolution residual values,
etc.). Finally, the AVQ 134 of the preferred embodiment
adapts a codebook 214 to the source image 100 by devising a
set of new patterns to add to a codebook 214.
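
Two of the acceleration techniques above, early rejection of a pattern once its accumulated error exceeds the current minimum and early acceptance below a threshold, can be sketched as follows. This is a minimal illustration, not the AVQ 134 implementation; the function name, the `accept_threshold` parameter and the sample codebook are hypothetical, and codebook ordering and grouping are left out.

```python
def find_best_pattern(block, codebook, accept_threshold=0):
    """Return (index, squared_error) of the selected codebook pattern."""
    best_index = 0
    best_error = sum((x - p) ** 2 for x, p in zip(block, codebook[0]))
    if best_error < accept_threshold:
        return best_index, best_error          # early acceptance
    for j in range(1, len(codebook)):
        acc = 0
        for x, p in zip(block, codebook[j]):
            acc += (x - p) ** 2
            if acc > best_error:               # early rejection
                break
        if acc < best_error:
            best_index, best_error = j, acc
            if best_error < accept_threshold:  # early acceptance
                break
    return best_index, best_error


# Hypothetical codebook: flat dark, flat bright, and vertical stripes.
codebook = [[0] * 16, [8] * 16, [0, 8] * 8]
best = find_best_pattern([0, 9] * 8, codebook)
quick = find_best_pattern([1] * 16, codebook, accept_threshold=20)
```

With the default threshold of zero no pattern is accepted early, so the search returns the same index as an exhaustive comparison while skipping the tail of most error sums. A nonzero threshold trades a possibly suboptimal match for fewer comparisons.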

Therefore, the AVQ 134 of the preferred embodiment has
three modes of operation: 1) the AVQ 134 uses a
codebook 214, 2) the AVQ 134 selects the best-fit codebook
214, or 3) the AVQ 134 uses a combination of existing
codebooks 214, and new patterns that the AVQ 134 creates. If
the AVQ 134 creates new patterns, the AVQ 134 stores the new
patterns in the VQCB data segment 223.

FIGs. 20a and 20b illustrate the segmented architecture
of the data stream 118 that results from transmitting the
compressed file 104. The segmented architecture of the
compressed file 104 in the preferred embodiment allows
layering of the compressed image data. Referring to FIG. 2,
the layering of the compressed file 104 allows the decoder
110 to display the thumbnail miniature 120, the splash image
122 and the standard image 124 before the entire compressed
file 104 is transferred. As the decoder 110 receives each
successive layer of components, the decoder 110 adds
additional detail to the displayed image.

In addition to layering the compressed data, the
segmented architecture allows the decoder 110 of the
preferred embodiment: 1) to move from one segment to the next
in the stream without fully decoding segments of data, 2) to
skip parts of the data stream 118 that contain data that is
unnecessary for a given rendition of the image, 3) to ignore
parts of the data stream 118 that are in an unknown format,
4) to process the data in an order that is configurable on
the fly if the entire data stream 118 is stored locally, and
5) to store different layers of the compressed file 104
separately from one another.

As shown in FIG. 20a, the byte arrangement of the data
stream 118 and the compressed file 104 includes a header
segment 400 and a normal segment 402. The header segment 400
contains header information, and the normal segment 402
contains data. The header segment 400 is the first segment
in the compressed file 104 and is the first segment
transmitted with the data stream 118. In the preferred
embodiment, the header segment 400 is eight bytes long.

As shown in FIG. 20b, the byte arrangement of the header
segment 400 includes a byte 0 406 and a byte 1 408 of the
header segment 400. Byte 0 406 and byte 1 408 of the header
segment 400 identify the data stream 118. Byte 1 408 also
indicates if the data stream 118 contains image data
(indicated by a "G") or if it contains resource data
(indicated by a "C"). Resource data includes color lookup
tables, font information, and vector quantization tables.

Byte 2 410, byte 3 412, byte 4 414, byte 5 416, byte 6
418 and byte 7 420 of the header segment 400 specify which
encoder 102 created the data stream 118. As new encoding
methods are added to the encoder 102, new versions of the
encoder 102 will be sold and distributed to decode the data
encoded by the new methods. Thus, to remain compatible with
prior encoders 102, the decoder 110 needs to identify which
encoder 102 generated the compressed data. In the preferred
embodiment, byte 7 420 identifies the encoder 102 and byte 2
410, byte 3 412, byte 4 414, byte 5 416, and byte 6 418 are
reserved for future enhancements to the encoder 102.
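
The header layout described above can be read with a few lines of code. This is a sketch under assumptions: the function name and the returned field names are hypothetical, and the value of byte 0 is not given in this passage, so it is kept as opaque data rather than validated.

```python
def parse_header(segment):
    """Split an eight-byte header segment into its described fields."""
    if len(segment) != 8:
        raise ValueError("header segment must be exactly 8 bytes")
    kinds = {ord("G"): "image", ord("C"): "resource"}
    if segment[1] not in kinds:
        raise ValueError("byte 1 must be 'G' or 'C'")
    return {
        "stream_id": segment[0:2],     # bytes 0 and 1 identify the stream
        "content": kinds[segment[1]],  # 'G' = image data, 'C' = resource data
        "reserved": segment[2:7],      # bytes 2 through 6
        "encoder_id": segment[7],      # byte 7 identifies the encoder
    }


# Placeholder value for byte 0; the real value is not stated in the text.
header = bytes([0x00, ord("G"), 0, 0, 0, 0, 0, 3])
fields = parse_header(header)
```

A decoder could branch on `encoder_id` to choose decoding behavior compatible with the encoder version that produced the stream.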

FIG. 21 illustrates the normal segment 402 as a sequence 
of bytes that are logically separated into two sections: an 
identifier section 422 and a data section 424. The 
identifier section 422 precedes the data section 424. The 
identifier section 422 specifies the size of the normal 
segment 402, and identifies a segment type. The data section 
424 contains information about the source image 100. 

The identifier section 422 is a sequence of one,
two, or three bytes that identifies the length of the normal
segment 402 and the segment type. The segment type is an
integer number that specifies the method of data encoding.
The compressed file 104 contains 256 possible segment types.
The data in the normal segment 402 is formatted according to
the segment type. In the preferred embodiment, the normal
segments 402 are optimally formatted for the color palette,
the Huffman bitstreams, the Huffman tables, the image panels,
the codebook information, the vector dequantization tables,
etc.
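
Because every normal segment announces its own type and length, a reader can step over segments it does not need or does not recognize. The sketch below illustrates that idea only. The exact encoding of the identifier section is not given in this passage, so a fixed three-byte layout (one type byte followed by a two-byte big-endian data length) is assumed purely for the example, and the function names are hypothetical.

```python
import struct


def iter_segments(stream, offset=0):
    """Yield (segment_type, data) pairs without interpreting the data.

    Assumed layout per segment: 1 type byte, 2-byte big-endian length
    of the data section, then the data section itself.
    """
    while offset < len(stream):
        seg_type, length = struct.unpack_from(">BH", stream, offset)
        offset += 3
        yield seg_type, stream[offset:offset + length]
        offset += length


def select_segments(stream, wanted_types):
    """Keep only segments whose type is wanted, skipping all others."""
    return [(t, d) for t, d in iter_segments(stream) if t in wanted_types]


# Three segments: type 1 (2 bytes), type 9 (empty), type 2 (1 byte).
stream = bytes([1, 0, 2, 0xAA, 0xBB, 9, 0, 0, 2, 0, 1, 0xCC])
known = select_segments(stream, {1, 2})
```

Skipping costs only a header read and an offset update, which is what lets a decoder ignore segment types introduced by a newer encoder.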

For example, the file format of the preferred embodiment
allows the use of different Huffman bitstreams such as an
8-bit Huffman stream, a 10-bit Huffman stream, and a DCT
Huffman stream. The encoder 102 uses each Huffman bitstream
to optimize the compressed file 104 in response to different
image types. The identifier section 422 identifies which
Huffman encoder was used and the normal segment 402 contains
the compressed data.

FIGs. 22a, 22b, 22c, and 22d illustrate the layering and
interleaving of the plurality of data segments 166 in the
compressed file 104 of the preferred embodiment. The
plurality of data segments 166 in the compressed file 104 are
interleaved based on the user defined playback model 261 as
follows: 1) as a single-pass, non-panellized image (FIG.
22a), 2) as a single-pass, panellized image (FIG. 22b), 3) as
two layers comprising the thumbnail miniature 120, and the
sharp image 125 (FIG. 22c) and 4) as multiple layers
comprising the thumbnail miniature 120, the standard image
124, and the sharp image 125 (FIG. 22d).

Block diagram 426 in FIG. 22a shows the compressed file 
format for the single-pass, non-panellized image. The 
compressed file 104 begins with the header, the optional 
color palette and the resource data such as the tables and 
Huffman encoding information. The plurality of data segments 
166 are not interleaved or layered. Thus, the decoder 110 
must receive the entire compressed file 104 before any part 
of the source image 100 can be displayed. 

Block diagram 428 in FIG. 22b shows the compressed file
104 for the single-pass, panellized image. The plurality of
data segments 166 are interleaved panel-by-panel, so that all
of the segments for each panel are contiguously transmitted.
The decoder 110 can expand and display a panel at a time
until the entire compressed file 104 is expanded.
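
Panel-by-panel interleaving can be sketched as a simple reordering. This is an illustration under assumptions: the function name and the dictionary layout are hypothetical, and the segment labels "DCT" and "VQ1" are borrowed from the data segments named earlier only to make the example readable.

```python
def interleave_by_panel(segments_by_type, panel_count):
    """Order segments so that all segments of one panel are contiguous.

    segments_by_type maps a segment label to a list holding that
    segment's payload for each panel, in panel order.
    """
    ordered = []
    for panel in range(panel_count):
        for seg_type, per_panel in segments_by_type.items():
            ordered.append((panel, seg_type, per_panel[panel]))
    return ordered


segments = {"DCT": [b"d0", b"d1"], "VQ1": [b"v0", b"v1"]}
ordered = interleave_by_panel(segments, 2)
```

With this ordering a decoder holds everything needed for panel 0 before any byte of panel 1 arrives, so it can display one panel at a time.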

Block diagram 430 in FIG. 22c shows the compressed file
format of the thumbnail miniature 120, the splash image 122
and the final or sharp image 125. The plurality of data
segments 166 are interleaved panel-by-panel and the
resolution components for the thumbnail miniature 120 and
splash image 122 exist in the first layer; the panels for the
final image exist in the second layer. The first layer
includes selected portions of the plurality of data segments
166 that are needed to decode the panels of the thumbnail
miniature 120 and splash image 122. Thus, the compressed
file 104 only stores the low detail color components (URCA
data segment 246, the XRCA data segment 248), the DC terms
201 and as many as the first five AC terms 200 in the first
layer. The number of AC terms 200 depends on the
user-selected quality of the thumbnail miniature 120.

The plurality of data segments 166 in the first layer
are also interleaved panel-by-panel to allow the thumbnail
miniature 120 and splash image 122 to be decoded a panel at
a time. The second layer contains the remaining plurality of
data segments 166 needed to expand the compressed file 104
into the final image. The plurality of data segments 166 in
the second layer are also interleaved panel-by-panel.

Block 432 in FIG. 22d shows the compressed file format
of the thumbnail image 120, the splash image 122, the layered
standard image 124, and the sharp image 125. The thumbnail
miniature 120 and splash image 122 are arranged in the first
layer as described above. The remaining data segments 166
are layered at different quality levels. The multi-layering
is accomplished by layering and interleaving panel
information associated with the VQ2 data segment 258 (high
resolution residual). The multiple layers allow the display
of all the panels at a particular level of detail before
decoding the panels in the next layer.

The Decoder

FIG. 23 illustrates the decoder 110 of the present
invention. The decoder 110 takes as input the compressed
data stream 118 and expands or decodes it into an image for
viewing on the display 112. As explained above, the
compressed file 104 and the transmitted data stream 118
include image components that are layered with a plurality of
panels 433. The decoder 110 expands the plurality of panels
433 one at a time.

As illustrated in FIG. 24, the decoder 110 expands the
compressed file 104 in four steps. In a first step 434, the
decoder 110 expands the first layer of image data in the
compressed file 104 or the data stream 118 into a Ym
miniature 436, a Um miniature 438, and an Xm miniature 440.
In a second step 442, the decoder 110 uses the Ym miniature
436, the Um miniature 438, and the Xm miniature 440 to
generate the thumbnail miniature 120 and the splash image
122. In a third step 444, the decoder 110 receives a second
layer of image data and generates the higher detail panels
445 needed to expand the thumbnail miniature 120 into a
standard image 124. In a fourth step 446, the decoder 110
receives a third layer of image data to generate higher
detail panels to enhance the detail of the standard image in
order to create an enhanced image 105 that corresponds to the
source image 100.

FIG. 25 illustrates the elements of the first step 434
in which the decoder 110 expands the AC terms 200, the DC
terms 201, the URCA data segment 246, and the XRCA data
segment 248 into the Ym miniature 436, the Um miniature 438,
and the Xm miniature 440. The first step 434 includes an
inverse Huffman encoder 458, an inverse DPCM 476, a
dequantizer 450, a combiner 452, an inverse DCT 476, a
demultiplexer 454, and an adder 456.

The decoder 110 then separates the DC terms 201 and the
AC terms 200 from the URCA data segment 246 and the XRCA data
segment 248. The inverse Huffman encoder 458 decompresses
the first layer of the data stream 118, which includes the AC
terms 200, the URCA data segment 246, and the XRCA data
segment 248. The inverse DPCM 476 further expands the DC
terms 201 to output DC terms 201'. The dequantizer 450
further expands the AC terms 200 to output AC terms 200' by
multiplying the AC terms 200 by the quantization factors 478
in the quantization table Q 202, producing 8x8 DCT
coefficient blocks 482. The quantization table Q 202 is
stored in the CS data segment 204 (not shown).
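
The inverse DPCM and dequantization stages described above can
be sketched as follows. This is an illustration only. The
passage does not specify the DPCM predictor, so the sketch
assumes that each DC term was stored as its difference from the
preceding DC term. The function names inverse_dpcm and
dequantize, and all sample values, are hypothetical and are not
taken from the disclosure.

```python
from itertools import accumulate


def inverse_dpcm(differences, start=0):
    """Rebuild values from successive differences.

    Assumption: a previous-value predictor, so each stored number is the
    difference between a DC term and the DC term before it.
    """
    return list(accumulate(differences, initial=start))[1:]


def dequantize(quantized, q_factors):
    """Multiply each quantized coefficient by its quantization factor."""
    if len(quantized) != len(q_factors):
        raise ValueError("coefficient and table lengths differ")
    return [coeff * q for coeff, q in zip(quantized, q_factors)]


# Hypothetical sample data for a handful of coefficients.
dc_terms = inverse_dpcm([12, 3, -2, 0, 5])
ac_terms = dequantize([4, -1, 0, 2, 0], [16, 11, 10, 16, 24])
```

In this sketch the quantization factors are passed in as a plain
list; in the decoder they would come from the quantization table
Q 202.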

The combiner 452 combines the output DC terms 201' with
the 8x8 DCT coefficient blocks 482. The decoder 110 sets
the inverse DCT factor 480, and the inverse DCT 476 outputs
the DCT coefficient blocks 482 that correspond to the Ym
miniature 436 that is 1/256th the size of the original image.

The demultiplexer 454 separates the inverse Huffman
encoded URCA data segment 246 from the XRCA data segment 248.
The inverse DPCM 476 then expands the URCA data segment 246
and the XRCA data segment 248 to generate the blocks that
correspond to the Um miniature 438 and the Xm miniature 440.
The adder 456 translates the blocks corresponding to the Um
miniature 438 and the Xm miniature 440 into blocks that
correspond to an Xm miniature 460.

FIG. 26 illustrates the second step 442 in which the
decoder 110 expands the Ym miniature 436, the Um miniature
438, and the Xm miniature 460. The decoder 110 further
includes the interpolator 462 that operates on the Ym
miniature 436, the Um miniature 438 and the Xm miniature 460.
The interpolator 462 is controlled by a Ym interpolation
factor 484, a Um interpolation factor 486, and an Xm
interpolation factor 496. A scaler 466 is controlled by a Ym
scale factor 490, a Um scale factor 492, and an Xm scale
factor 494. The decoder 110 further includes the replicator
464 and the inverse color converter. The interpolator 462
uses a linear interpolation process to enlarge the Ym
miniature 436, the Um miniature 438, and the Xm miniature 460
by one, two or four times in both the horizontal and vertical
directions.

The Ym interpolation factor 484, the Um interpolation
factor 486, and the Xm interpolation factor 488 control the
amount of interpolation. The size of the source image 100 in
the compressed file 104 is fixed; thus the decoder 110 may
need to enlarge or reduce the expanded image before display.
The decoder 110 sets the Ym interpolation factor 484 to a
power of 2 (i.e., 1, 2, 4, etc.) in order to optimize the
decoding process. However, in order to display an expanded
image at the proper size, the scaler 466 scales the
interpolated image to accommodate different display formats.

The interpolator 462 also expands the Um miniature 438
and the Xm miniature 440. Like the Ym interpolation factor
484, the decoder 110 sets the Um interpolation factor 486 and
the Xm interpolation factor 496 to a power of two. The
decoder 110 sets the Ym interpolation factor 484 and the Um
interpolation factor 486 so that the Um miniature 438 and Xm
miniature 460 approximate the size of the interpolated and
scaled Ym miniature 436.

After interpolation, the scaler 466 enlarges or reduces
the interpolated Ym miniature based on the Ym scale factor
490. In the preferred embodiment, the decoder 110 sets the
Ym interpolation factor 484 so that the interpolated Ym
miniature 436 is nearly twice the size of the thumbnail
miniature 120. The decoder 110 then sets the Ym scale factor
490 to reduce the interpolated Ym miniature 436 to the
display size of the thumbnail miniature 120. The scaler 466
scales the Um miniature 438 and the Xm miniature 460
with the Um scale factor 492 and the Xm scale factor 494.
The decoder 110 sets the Xm scale factor 494 and the Um scale
factor 492 as necessary to scale the image to the display
size.
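
One way the two-stage sizing described above might be arranged
is sketched below: a power-of-two interpolation factor is chosen
first, and the remaining adjustment is left to a rational scale
factor. The selection policy (the largest allowed factor whose
result does not exceed twice the target width) is an assumption
made for illustration, as are the function name choose_factors
and the sample widths; the disclosure states only that the
interpolated miniature is nearly twice the thumbnail size.

```python
from fractions import Fraction


def choose_factors(miniature_width, target_width, allowed=(1, 2, 4)):
    """Return (interpolation factor, scale factor) for one colour plane.

    Assumed policy: take the largest allowed power of two that keeps the
    interpolated width at or below twice the target width, then let the
    scale factor absorb the remaining difference.
    """
    candidates = [f for f in allowed if miniature_width * f <= 2 * target_width]
    interp = max(candidates) if candidates else min(allowed)
    scale = Fraction(target_width, miniature_width * interp)
    return interp, scale


# Hypothetical example: a 40-pixel-wide miniature shown 100 pixels wide.
interp_factor, scale_factor = choose_factors(40, 100)
```

Keeping the scale factor as a Fraction preserves the rational
P/Q form that a decimating scaler can use directly.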

The inverse color converter 468 transforms the
interpolated and scaled miniatures into a red, green, and
blue pixel array or a palettized image as required by the
display 112. When converting to a palettized image, the
inverse color converter 468 also dithers the converted image.
The decoder 110 displays the interpolated, scaled and color
converted miniatures as the thumbnail miniature 120.

In order to create the splash image 122, the decoder 110
expands the interpolated Ym miniature 436, the interpolated
Um miniature 438 and the interpolated Xm miniature 440 with
a second interpolation process that uses a Ym splash
interpolation factor 498, a Um splash interpolation factor
500, and an Xm splash interpolation factor 502. As with the
thumbnail miniature 120, the decoder 110 also sets the splash
interpolation factors to a power of two.

The interpolated data are then expanded with the
replicator 464. The replicator 464 enlarges the interpolated
data one or two times by replicating the pixel information.
The replicator 464 enlarges the interpolated data based on a
Ym replication factor 504, a Um replication factor 506, and
an Xm replication factor 508. The decoder 110 sets the Ym
replication factor 504, the Um replication factor 506, and
the Xm replication factor 508 so that the replicated image is 
one-fourth of the display size. 
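
Pixel replication of the kind performed by the replicator 464
can be sketched as below. The function name replicate and the
list-of-rows image representation are assumptions made for
illustration only.

```python
def replicate(pixels, factor):
    """Enlarge a 2-D list of pixels by repeating each pixel `factor`
    times horizontally and each row `factor` times vertically."""
    if factor < 1:
        raise ValueError("replication factor must be at least 1")
    enlarged = []
    for row in pixels:
        wide_row = [p for p in row for _ in range(factor)]
        # Copy the widened row so that the output rows are independent lists.
        enlarged.extend(list(wide_row) for _ in range(factor))
    return enlarged


# Hypothetical 2 x 2 plane doubled to 4 x 4.
doubled = replicate([[1, 2], [3, 4]], 2)
```

Replication needs no arithmetic on pixel values, which is why it
is cheaper than interpolation for the final enlargement.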

The inverse color converter 468 transforms the
replicated image data into red, green and blue image data.
The replicator 464 then again replicates the red, green, and
blue image data to match the display size. The decoder 110
displays the resulting splash image 122 on the display 112.

FIG. 27 illustrates the third step 444 in which the
decoder 110 generates the higher detail panels to expand the
thumbnail miniature 120 into a standard image 124. FIG. 27
also illustrates the fourth step 446 in which the decoder 110
generates higher detail panels to enhance the detail
of the standard image in order to create an enhanced image
105 that corresponds to the source image 100.

The decoding of the standard image 124 and the enhanced
image 105 requires the inverse Huffman encoder 458, the
combiner 452, the dequantizer 450, the inverse DCT 476, a
pattern matcher 524, the adder 456, the interpolator 462, and
an edge overlay builder 516. The decoder 110 adds additional
detail to the displayed image as the decoder 110 receives new
layers of compressed data. The additional layers include new
panels of the DCT data segment 208 (containing the remaining
AC terms 200'), the VQ1 data segment 224, the VQ2 data
segment 258, the enhancement location data segment 510, the
VQ3 data segment 242, and the VQ4 data segment 244.

The decoder 110 builds upon the Ym miniature 436, the Um
miniature 438 and the Xm miniature 440 calculated for the
thumbnail miniature 120 by expanding the next layer of image
detail. The next layer contains a portion of the DCT data
segment 208, the VQ1 data segment 224, the VQ2 data segment
258, the enhancement location data segment 510, the VQ3 data
segment 242, and the VQ4 data segment 244 that correspond to
the standard image.

The inverse Huffman encoder 458 decompresses the DCT
data segment 208 and the VQ1 data segment 224 (the DCT
residual). The combiner 452 combines the DCT information
from the inverse Huffman encoder 458 with the AC terms 200
and the DC terms 201. The dequantizer 450 reverses the
quantization process by multiplying the DCT quantized values
206 with the quantization factors 478. The dequantizer
obtains the correct quantization factors 478 from the
quantization table Q 202. The dequantizer outputs 8x8 DCT
coefficient blocks 482 to the inverse DCT 476. The inverse
DCT 476, in turn, outputs the 8x8 DCT coefficient blocks 482
that correspond to a Y image 509 that is 1/4th the size of
the original image.

The pattern matcher 524 replaces the DCT residual blocks
512 by finding an index to a matching pattern block in the
codebook 214. The adder 456 adds the DCT residual blocks 512
to the DCT coefficient blocks 482 on a pixel by pixel basis.
The interpolator 462 interpolates the output of the adder 456
by a factor of four to create a full size Y image 520. The
interpolator 462 performs bilinear interpolation to enlarge
the Y image 520 horizontally and vertically.

The inverse Huffman encoder 458 decompresses the VQ2
data segment 258 (the high resolution residual) and the
enhancement location data segment 510. The pattern matcher
524 uses the codebook indexes to retrieve the matching
pattern blocks stored in the codebook 214 to expand the VQ2
data segment 258 to create 16 x 16 high resolution residual
blocks 514. An enhancement overlay builder 516 inserts the
16 x 16 high resolution residual blocks into a Y image
overlay 518 specified by the edge location data segment 510.
The Y image overlay 518 is the size of the original image.
The adder 456 adds the Y image overlay 518 to the full sized
Y image 520.
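
The overlay step can be pictured with the following sketch,
which places decoded residual blocks into an otherwise empty
overlay and then adds the overlay to the image. The encoding of
the location data is not given in this passage, so the sketch
assumes each location is a (top, left) pixel coordinate. The
names build_overlay and add_images are hypothetical.

```python
def build_overlay(height, width, residual_blocks, locations, block=16):
    """Place square residual blocks into a zero-filled overlay.

    Assumption: `locations` holds one (top, left) pixel coordinate per block.
    """
    overlay = [[0] * width for _ in range(height)]
    for blk, (top, left) in zip(residual_blocks, locations):
        for r in range(block):
            for c in range(block):
                overlay[top + r][left + c] = blk[r][c]
    return overlay


def add_images(a, b):
    """Add two equally sized images pixel by pixel."""
    return [[pa + pb for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical small example using 2 x 2 blocks instead of 16 x 16.
overlay = build_overlay(4, 4, [[[1, 2], [3, 4]]], [(2, 1)], block=2)
enhanced = add_images(overlay, [[10] * 4 for _ in range(4)])
```

Because the overlay is zero wherever no residual block was sent,
the addition leaves unenhanced regions of the image unchanged.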

To calculate the full sized U image 522, the inverse
Huffman encoder 458 expands the VQ3 data segment 242. The
pattern matcher 524 uses the codebook indexes to retrieve the
matching pattern blocks stored in the codebook 214 to expand
the VQ3 data segment 242 into 4x4 rU_tau4 residual blocks
526. The interpolator 462 interpolates the Um miniature 438

The decoder 110 repeats the process illustrated in FIG. 27
to generate a new full sized Y image 520, a new full sized U
image 522, and a new full sized X image 530. The new full
sized Y image 520 is added to the full sized Y image
generated in the third step 444. The new full sized U image
522 is added to the full sized U image 522 generated in the
third step 444. The new full sized X image 530 is added to
the full sized X image generated in the third step 444.

The inverse color converter 468 converts the full sized
Y image 520, the full sized U image 522, and the full sized
X image 530 into a full sized red, green, and blue image.
The panel is then added to the displayed image. This process
is completed for each panel until the entire enhanced image
105 is expanded.

The inverse DCT 476 of the preferred embodiment is a
mathematical transformation for mapping data in the
frequency domain back to the time (or spatial) domain, based
on the "cosine" kernel. The two dimensional version operates
on a block of 8 x 8 elements.

Referring to FIG. 9, the compressed DCT coefficients 198
are stored as DC terms 201 and AC terms 200. In the
preferred embodiment, the inverse DCT 476 as shown in FIGs.
25 and 27 combines the process of transformation and
decimation in the frequency and spatial domains (frequency
and then spatial) into a single operation in the frequency
domain. The inverse DCT 476 of the present invention
provides at least a factor of 2 in implementation efficiency
and is utilized by the decoder 110 to expand the thumbnail
miniature 120 and splash image 122.

The inverse DCT 476 receives a sequence of DC terms 201
and AC terms 200 which are frequency coefficients. The high
frequency terms are arbitrarily discarded at a predefined
frequency to prevent aliasing. The discarding of the high
frequency terms is equivalent to a low pass filter which
passes everything below a predefined frequency while
attenuating all the high frequencies to zero.

All elements with i or j greater than 1 are set to zero.
The setting of the high frequency index to zero is equivalent
to filtering out the high frequency coefficients from the
signal.

Assigning Y as the 2x2 output matrix, the decimated
output is thus equal to:

     Y_{0,0} := X_{0,0} + k_1 X_{0,1} + k_1 X_{1,0} + k_2 X_{1,1}
     Y_{0,1} := X_{0,0} - k_1 X_{0,1} + k_1 X_{1,0} - k_2 X_{1,1}
     Y_{1,0} := X_{0,0} + k_1 X_{0,1} - k_1 X_{1,0} - k_2 X_{1,1}
     Y_{1,1} := X_{0,0} - k_1 X_{0,1} - k_1 X_{1,0} + k_2 X_{1,1}

where

     k_1 := \frac{1}{[illegible]} (c(1) + c(3) + c(5) + c(7))
     c(k) = \cos(\pi \cdot [illegible])
     k_2 := (k_1)^2
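
The 2x2 decimated output can be checked numerically with the
sketch below. The constants are assumptions, not values read
from the disclosure: the sketch takes c(k) to be cos(k*pi/16),
assumes an orthonormal 8 x 8 DCT, assumes the output is scaled
so that X_{0,0} carries unit weight, and derives
k_1 = (c(1)+c(3)+c(5)+c(7)) / (2*sqrt(2)) and k_2 = k_1^2 from
those choices. Under these assumptions the four-term formulas
equal eight times the mean of each 4 x 4 quadrant of a full
inverse DCT of the low-passed block. All function names are
hypothetical.

```python
import math


def c(k):
    # Assumed definition: the usual 8-point DCT angle k*pi/16.
    return math.cos(math.pi * k / 16)


# Assumed normalisation (see lead-in); not taken from the disclosure.
K1 = (c(1) + c(3) + c(5) + c(7)) / (2 * math.sqrt(2))
K2 = K1 ** 2


def idct_2x2(X):
    """2 x 2 decimated output from the four lowest-frequency terms of X."""
    x00, x01, x10, x11 = X[0][0], X[0][1], X[1][0], X[1][1]
    return [
        [x00 + K1 * x01 + K1 * x10 + K2 * x11,
         x00 - K1 * x01 + K1 * x10 - K2 * x11],
        [x00 + K1 * x01 - K1 * x10 - K2 * x11,
         x00 - K1 * x01 - K1 * x10 + K2 * x11],
    ]


def reference_2x2(X):
    """Full orthonormal 8 x 8 inverse DCT of the low-passed block, then
    the mean of each 4 x 4 quadrant multiplied by 8."""
    def alpha(u):
        return math.sqrt(1 / 8) if u == 0 else 0.5

    pixels = [[sum(alpha(u) * alpha(v) * X[u][v]
                   * math.cos((2 * m + 1) * u * math.pi / 16)
                   * math.cos((2 * n + 1) * v * math.pi / 16)
                   for u in range(2) for v in range(2))
               for n in range(8)] for m in range(8)]
    return [[8 * sum(pixels[m][n]
                     for m in range(4 * i, 4 * i + 4)
                     for n in range(4 * j, 4 * j + 4)) / 16
             for j in range(2)] for i in range(2)]


# Hypothetical low-frequency coefficients.
X = [[100.0, 20.0], [-15.0, 5.0]]
fast = idct_2x2(X)
slow = reference_2x2(X)
```

The direct form uses a handful of multiplications per block,
whereas the reference evaluates all 64 pixels before averaging,
which illustrates why combining transformation and decimation
into one frequency-domain step saves work.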

The creation of a 4 x 4 output matrix where a given X is
an 8 x 8 input matrix that consists of DC terms 201 and AC
terms 200 is stated formally as:

All elements with i or j greater than 3 are set to zero.
It is possible to implement the calculations as in the
2x2 case, where the two dimensional equation is decomposed
downward; however, performing the one dimensional approach
twice reduces complexity and decreases the calculation time.
In the preferred embodiment, the inverse DCT 476 computes an
additional one-dimensional row inverse DCT, and then a
one-dimensional column inverse DCT.

The equation for a one dimensional case is as follows
(1dout_x are the elements of the one dimensional case):

     1dout_0 := in_0 + k_1 in_1 + k_2 in_2 + k_3 in_3
     1dout_1 := in_0 + k_4 in_1 - k_2 in_2 - k_5 in_3
     1dout_2 := in_0 - k_4 in_1 - k_2 in_2 + k_5 in_3
     1dout_3 := in_0 - k_1 in_1 + k_2 in_2 - k_3 in_3

with

     k_1 := (c(1) + c(3)) / [illegible]
     k_2 := (c(2) + c(6)) / [illegible]
     k_3 := (c(3) - c(7)) / [illegible]
     k_4 := (c(5) + c(7)) / [illegible]
     k_5 := (c(5) + c(1)) / [illegible]

where c(k) is defined as in the 2x2 output matrix.
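
The one dimensional four-output case can be checked in the same
spirit. The sketch below is illustrative only and rests on
assumptions that are not read from the disclosure: c(k) is taken
to be cos(k*pi/16), the DCT is assumed orthonormal, the output
is assumed to be scaled so that in_0 carries unit weight, and
every constant k_1 to k_5 is therefore given the denominator
sqrt(2). With those choices each output equals sqrt(8) times the
mean of a pair of adjacent samples of the full 8-point inverse
DCT. All function names are hypothetical.

```python
import math


def c(k):
    # Assumed definition: the usual 8-point DCT angle k*pi/16.
    return math.cos(math.pi * k / 16)


ROOT2 = math.sqrt(2)
# Assumed denominators (see lead-in); not taken from the disclosure.
K1 = (c(1) + c(3)) / ROOT2
K2 = (c(2) + c(6)) / ROOT2
K3 = (c(3) - c(7)) / ROOT2
K4 = (c(5) + c(7)) / ROOT2
K5 = (c(5) + c(1)) / ROOT2


def idct_1d_4(inp):
    """Four decimated outputs from the four lowest-frequency inputs."""
    in0, in1, in2, in3 = inp[:4]
    return [
        in0 + K1 * in1 + K2 * in2 + K3 * in3,
        in0 + K4 * in1 - K2 * in2 - K5 * in3,
        in0 - K4 * in1 - K2 * in2 + K5 * in3,
        in0 - K1 * in1 + K2 * in2 - K3 * in3,
    ]


def reference_1d_4(inp):
    """Full orthonormal 8-point inverse DCT of the low-passed input,
    followed by pairwise averaging and scaling by sqrt(8)."""
    def alpha(u):
        return math.sqrt(1 / 8) if u == 0 else 0.5

    samples = [sum(alpha(u) * inp[u] * math.cos((2 * n + 1) * u * math.pi / 16)
                   for u in range(4)) for n in range(8)]
    return [math.sqrt(8) * (samples[2 * m] + samples[2 * m + 1]) / 2
            for m in range(4)]


# Hypothetical coefficients.
coeffs = [50.0, -12.0, 7.0, 3.0]
fast = idct_1d_4(coeffs)
slow = reference_1d_4(coeffs)
```

Applying such a one dimensional routine first to the rows and
then to the columns yields the 4 x 4 output without evaluating
the two dimensional sum directly.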

The scaler 466 of the preferred embodiment is also shown
in FIG. 27. More specifically, the scaler 466 utilizes a
generalized routine that scales the image up or down while
reducing aliasing and reconstruction noise. Scaling can be
described as a combination of decimation and interpolation.
The decimation step consists of downsampling and using an
anti-aliasing filter; the interpolation step consists of
pixel filling using a reconstruction filter for any scale
factor that can be represented by a rational number P/Q,
where P and Q are integers associated with the interpolation
and decimation ratios.

The scaler 466 decimates the input data by dividing the
source image into the desired number of output pixels and
then radiometrically weights the input data to form the
necessary output. FIG. 28 illustrates the scaler 466 with an
input to output ratio of five-to-three in the one dimensional
case. Input pixel P_1 538, pixel P_2 540, pixel P_3 542, pixel
P_4 544, and pixel P_5 546 contain different data values. The
output pixel X_1 548, pixel X_2 550, and pixel X_3 552 are
computed as follows:

     X_1 = P_1 + (P_2)(0.67)
     X_2 = (P_2)(0.33) + P_3 + (P_4)(0.33)
     X_3 = (P_4)(0.67) + P_5

The decimated data is then filtered with a
reconstruction filter and an area average filter. The
reconstruction filter interpolates the input data by
replicating the pixel data. The area average filter then
area averages by integrating the area covered by the output
pixel.
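
The five-to-three weighting can be reproduced with a short
sketch that computes, for each output pixel, how much of every
input pixel it covers. Exact fractions are used, so the weights
appear as 2/3 and 1/3 rather than the rounded decimals. The
function names area_weights and decimate are hypothetical, and
the sketch covers the one dimensional case only.

```python
from fractions import Fraction


def area_weights(n_in, n_out):
    """For each output pixel, list (input index, covered fraction) pairs."""
    span = Fraction(n_in, n_out)  # input pixels covered by one output pixel
    rows = []
    for j in range(n_out):
        lo, hi = j * span, (j + 1) * span
        row = []
        for i in range(n_in):
            overlap = min(hi, i + 1) - max(lo, i)
            if overlap > 0:
                row.append((i, overlap))
        rows.append(row)
    return rows


def decimate(pixels, n_out):
    """Weighted sums of the input pixels, one per output pixel.

    The sums are left unnormalised; dividing each by
    len(pixels) / n_out would turn them into area averages.
    """
    return [sum(pixels[i] * w for i, w in row)
            for row in area_weights(len(pixels), n_out)]


weights = area_weights(5, 3)
sums = decimate([3, 3, 3, 3, 3], 3)
```

The same routine handles any rational ratio, for example seven
inputs to four outputs, because the overlap of each output
interval with each input pixel is computed rather than tabulated.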

If the output ratio is less than 1 (i.e., interpolation
is necessary), the interpolator 462 utilizes bilinear
interpolation. FIG. 29 illustrates the operation of the
bilinear interpolation. Input pixel A 554, input pixel B
556, input pixel C 558, input pixel D 560, and reference
point X 562 are interpolated to create output 564. For this
example, reference point X 562 is α to the right of pixel A
554 and 1-α to the left of pixel C 558, and reference point
X 562 is β down from pixel A 554 and 1-β up from pixel B 556.
Reference point X 562 is stated formally as:

     X = (1-α) * ((1-β)*A + β*B) + α * ((1-β)*C + β*D)
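
A direct transcription of the bilinear formula is sketched
below, assuming pixel A is the upper-left neighbour, pixel B
lies below A, pixel C lies to the right of A, and pixel D is
diagonally opposite A. The function name bilinear and the sample
values are hypothetical.

```python
def bilinear(a, b, c, d, alpha, beta):
    """Interpolate at a point `alpha` to the right of A and `beta` below A.

    Assumed layout:  A  C
                     B  D
    with 0 <= alpha <= 1 and 0 <= beta <= 1.
    """
    left = (1 - beta) * a + beta * b    # blend down the A-B column
    right = (1 - beta) * c + beta * d   # blend down the C-D column
    return (1 - alpha) * left + alpha * right


# Hypothetical pixel values; the centre point averages all four.
centre = bilinear(10, 20, 30, 40, 0.5, 0.5)
```

At the four corners the function returns the corresponding input
pixel unchanged, which is a quick sanity check on the weights.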

The Image Classifier

The preferred embodiment of the image classifier 152 is
illustrated in FIG. 8. More specifically, the image
classifier 152 uses fuzzy logic techniques to determine which
compression methods will optimize the compression of various
regions of the source image 100. The image classifier 152
adds intelligence to the encoder 102 by providing the means
to decide, based on statistical characteristics of the image,
what "tools" (combinations of compression methods) will best
compress the image.

The source image 100 may include a combination of
different image types. For example, a photograph could show
a person framed in a graphical border, wherein the person is
wearing a shirt that contains printed text. In order to
optimize the compression ratio for the regions of the image
that contain different image types, the image classifier 152
subdivides the source image 100 and then outputs the control
script 196 that specifies the correct compression methods for
each region. Thus, the image classifier 152 provides a
customized, "most-efficient" compression ratio for multiple
image types.

The image classifier 152 uses fuzzy logic to infer the
correct compression steps from the image content. Image
content is inherently "fuzzy" and is not amenable to simple
discrete classification. Images will thus tend to belong to
several "classes." For example, a classification scheme
might include one class for textual images and a second class
for photographic images. Since an image may comprise a
photograph of a person wearing a shirt containing printed
text, the image will belong to both classes to varying
degrees. Likewise, the same image may be high contrast,
"grainy," black and white and/or high activity.

Fuzzy logic is a set-theoretic approach to
classification of objects that assigns degrees of membership
in a particular class. In classical set theory, an object
either belongs to a set or it does not; membership is either
100% or 0%. In fuzzy set theory, an object can be partly in
one set and partly in another. The fuzziness is of greater
significance when the content must be categorized for the
purpose of applying appropriate compression techniques.
Relevant categories in image compression include
photographic, graphical, noisy, and high-energy. Clearly the
boundaries of these sets are not sharp. A scheme that
matches appropriate compression tools to image content must
reliably distinguish between content types that require
different compression techniques, and must also be able to
judge how to blend tools when types requiring different tools
overlap.

FIG. 30 illustrates the optimization of the compression
process. The optimization process analyzes the input image
600 at different levels. In the top level analysis 602 the
image classifier 152 decomposes the image into a plurality of
subimages 604 (regions) of relatively homogeneous content as
defined by a classification map 606. The image classifier
152 then outputs the control script 196 that specifies which
compression methods or "tools" to employ in compressing each
region. The compression methods are further optimized in the
second level analysis 608 by the enhancement analyzer 144,
which determines which areas of an image are the most
visually important (for example, text and strong luminance
edges). The compression methods are then further optimized
in the third level analysis 610 with the optimized DCT 136,
AVQ 134, and adaptive methods in the channel encoder 168.
The second level analysis 608 and the third level analysis
610 determine how to adapt parameters and tables to a
particular image.

The fuzzy logic image classifier 152 provides adaptive
"intelligent" branching to appropriate compression methods
with a high degree of computational simplicity. It is not
feasible to provide the encoder 102 with an exhaustive
mapping of all possible combinations of inherently
non-linear, discontinuous, multidimensional inputs (image
measurements) onto desired control scripts 196. The fuzzy
logic image classifier 152 reduces such an analysis.

Furthermore, the fuzzy logic image classifier 152
ensures that the encoder 102 makes a smooth transition from
one compression method (as defined by the control script 196)
to another compression method. As image content becomes
"more like" one class than another, the fuzzy controller
avoids the discrete switching from one compression method to
another compression method.

The fuzzy logic image classifier 152 receives the image
data and determines a set of image measurements which are
mapped onto one or more input sets. The image classifier 152
in turn maps the input sets to corresponding output sets that
identify which compression methods to apply. The output sets
are then blended ("defuzzified") to generate a control script
196. The process of mapping the input image to a particular
control script 196 thus requires three sets of rules: 1)
rules for mapping input measurements onto input sets (e.g.,
degree of membership with the "high activity" input set = F[
average of AC coefficients 56-63]); 2) rules for mapping
input sets onto output sets (e.g., if graphical image, use
DCT quantization table 5); and 3) rules for defuzzification
that mediate between membership of several output sets, i.e.,
how the membership of more than one output set should be
blended to generate a single control script 196 that controls
the compression process.
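
The three kinds of rules can be pictured with the toy sketch
below. It is not the rule base of the disclosure: the linear
ramp membership function, the measurement names, every numeric
threshold, and the pairing of photographic content with
quantization table 2 are assumptions made for illustration. Only
the "graphical image, use DCT quantization table 5" pairing and
the treatment of the photographic set as the complement of the
graphic set follow the text.

```python
def ramp(x, lo, hi):
    """Membership rising linearly from 0 at `lo` to 1 at `hi` (assumed shape)."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def classify(measurements):
    # 1) Input measurements -> degrees of membership in input sets.
    graphic = ramp(measurements["histogram_gap_ratio"], 0.0, 0.5)
    input_sets = {
        "high_activity": ramp(measurements["mean_high_freq_ac"], 2.0, 10.0),
        "graphic": graphic,
        "photographic": 1.0 - graphic,
    }
    # 2) Input sets -> output sets, here "use quantization table N".
    #    Table 5 for graphics follows the text; table 2 is hypothetical.
    rules = [("graphic", 5), ("photographic", 2)]
    output_sets = {table: input_sets[name] for name, table in rules}
    # 3) Defuzzification: blend the output sets into normalised weights
    #    that a control script could use to mix the two tables.
    total = sum(output_sets.values())
    blend = {t: w / total for t, w in output_sets.items()} if total else {}
    return input_sets, blend


# Hypothetical measurements for one region.
sets, blend = classify({"mean_high_freq_ac": 6.0, "histogram_gap_ratio": 0.125})
```

Because the weights change gradually with the measurements, a
region that drifts from graphic toward photographic content moves
smoothly between tables instead of switching abruptly.
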

Still further, the fuzzy logic rule base is easily
maintained. The rules are modular. Thus, the rules can be
understood, researched, and modified independently of one
another. In addition, the rule bases are easily modified,
allowing new rules to make the image classifier 152 more
sensitive to different types of image content. Furthermore,
the fuzzy logic rule base is extendable to include additional
image types specified by the user or learned using neural
network or genetic programming methods.

FIG. 31 illustrates a block diagram of the image
classifier 152. In block 612 the image classifier 152
determines a set of input measurements 614 that correspond to
the source image 100. In order to determine the input
measurements 614, the image classifier 152 sub-divides the
source image 100 into a plurality of blocks. To conserve
computations, the user can enable the image classifier 152 to
select a random sample of the plurality of blocks to use as
the basis of the input measurements 614.

The image classifier 152 determines the set of input
measurements 614 from the plurality of blocks using a variety
of methods. The image classifier 152 calculates the mean,
the variance, and a histogram of all three color components.
The image classifier 152 performs a discrete cosine transform
of the image blocks to derive a set of DCT components wherein
each DCT coefficient is histogrammed to provide a frequency
domain profile of the inputted image. The image classifier
152 performs special convolutions to gather information about
edge content, texture content, and the efficacy of the Reed
Spline Filter. The image classifier 152 derives spatial
domain blocks and matches the spatial domain blocks with a
special VQ-like pattern list to provide information about the
types of activity contained in the picture. Finally, the
image classifier scans the image for common and possibly
localized features that bear on the compressibility of the
image (such as typed text or scanning artifacts).

In block 616 the image classifier 152 analyzes the input
measurements 614 generated in block 612 to determine the
extent to which the source image 100 belongs to one of the
fuzzy input sets 618 within the input rule base 620. The
input rule base 620 identifies the list of image types. In
the preferred embodiment, the image classifier 152 contains
input sets 618 for the following image types: scale, text,
graphics, photographic, color depth, degree of activity, and
special features.

Membership in the activity input set and the scale image
input set is determined by the input measurements 614 for
the DCT coefficient histogram, the spatial statistics, and
the convolutions. Membership in the text image input set and
the graphic input set corresponds to the input measurements
614 for a linear combination of high frequency DCT
coefficients and gaps in the luminance histogram. The
photographic input set is the complement of the graphic input
set.

The color depth input set includes four classifications:
gray scale images, 4-bit images, 8-bit images and 24-bit
images. The color depth input corresponds to the input
measurements 614 for the Y, U and X color components. A
small dynamic range in the U and X color components indicates
that the picture is likely to be a gray scale image, while
gaps in the Y component histogram reveal whether the image
was once a palettized 4-bit or 8-bit image.
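
The two color depth cues can be illustrated with the following
sketch. The thresholds chroma_range_limit and gap_fraction, the
function names, and the crisp true/false outputs are all
assumptions made for illustration; the disclosure describes
fuzzy membership and does not give numeric limits.

```python
def dynamic_range(values):
    """Difference between the largest and smallest sample."""
    return max(values) - min(values)


def histogram_gaps(values):
    """Count unused levels strictly between the smallest and largest used level."""
    used = set(values)
    return sum(1 for v in range(min(used) + 1, max(used)) if v not in used)


def color_depth_hints(y, u, x, chroma_range_limit=8, gap_fraction=0.5):
    """Return crude hints about colour depth.

    Both thresholds are hypothetical tuning values.
    """
    span = dynamic_range(y)
    gray = (dynamic_range(u) <= chroma_range_limit
            and dynamic_range(x) <= chroma_range_limit)
    palettized = span > 1 and histogram_gaps(y) / (span - 1) >= gap_fraction
    return {"likely_gray_scale": gray, "likely_once_palettized": palettized}


# Hypothetical samples: five luminance levels only, nearly constant chroma.
hints = color_depth_hints([0, 64, 128, 192, 255] * 4, [128] * 20, [127, 128] * 10)
```

A fuzzy version would replace the two comparisons with
membership functions so that borderline images belong partly to
each class.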

The special feature input set corresponds to the input
measurements 614 for the common or localized features that
bear on the compressibility of the image. Thus the special
feature input set identifies such artifacts as black borders
caused by inaccurate scanning and graphical titling on a
photographic image.

In block 622 the image classifier 152 maps the input
sets 618 onto output sets 624 according to the output rule
base 626. The image classifier 152 applies the output rule
base 626 to map each input set 618 onto membership of each
fuzzy output set 624. The output sets 624 determine, for
example, how many CS terms are stored in the CS data segment
204 and the optimization of the VQ1 data segment 224, the VQ2
data segment 258, the VQ3 data segment 242, the VQ4 data
segment 244, and the number of VQ patterns to use. The
output sets also determine whether the encoder 102 performs
an optimized DCT 136 and which quantization tables Q 202 to
apply.

For the second Reed Spline Filter 225 and the third Reed
Spline Filter 227, the output sets 624 adjust the decimation
factor tau and the orientation of the kernel function.
Finally, the output sets determine whether the channel
encoder 168 utilizes a fixed Huffman encoder, an adaptive
Huffman encoder or an LZ1. FIG. 33 illustrates several
examples of mapping from input measurements 614 to input sets
618 to output sets 624.

Referring to FIG. 31, in block 626 the image classifier
constructs a classification map 628 based upon membership
within the output sets. The classification map 628
identifies independent regions in the source image 100 that
are independently compressed. Thus the image classifier 152
identifies the regions of the image that belong to compatible
output sets 624. These are regions that contain relatively
homogeneous image contrast and call for one method or set of
complementary methods to be applied to the entire region.

In block 630 the image classifier 152 converts
(defuzzifies), based on the defuzzification rule base 632,
the membership of the fuzzy output sets 624 of each
independent region in order to generate the control script
196. The control script 196 contains instructions for which
compression methods to perform and what parameters, tables,
and optimization levels to employ for a particular region of
the source image 100.

The Enhancement Analyzer

The preferred embodiment of the enhancement analyzer 144
is illustrated in FIGs. 4, 15 and 30. More specifically, the
enhancement analyzer 144 examines the Y_tau2 miniature 190,
the U_tau2 miniature 192, and the X_tau4 miniature 228 to
determine the enhancement priority of image blocks that
correspond to 16 x 16 blocks in the original source image
100. The enhancement analyzer 144 prioritizes the image
blocks by 1) calculating the mean of the Y_tau2 miniature
190, the U_tau2 miniature 192, and the X_tau4 miniature 228,
and 2) testing every color block against a normalized
threshold value E 252 for the Y_tau2 miniature 190, the
U_tau2 miniature 192, and the X_tau4 miniature 228. Blocks
that exceed the threshold value E 252 are added to
the enhancement list 250.

The enhancement analyzer 144 determines a threshold
value E_Y for the Y_tau2 miniature 190, a threshold value E_U
for the U_tau2 miniature 192, and a threshold value E_X for
the X_tau4 miniature 228. Once the enhancement analyzer 144
computes the threshold value E_Y, the threshold value E_U and
the threshold value E_X, the enhancement analyzer 144 tests
each 8x8 Y_tau2 block, each 4x4 U_tau4 block and each
4x4 X_tau4 block (each block corresponds to a 16 x 16 block
in the source image 100) as follows:

     Every pixel in the test block is convolved with
     the following filter masks:

          M_1 = {-1,-2,-1,0,0,0,1,2,1}
          M_2 = {1,0,-1,2,0,-2,1,0,-1}

     to compute two statistics S_1 and S_2.

Masks M_1 and M_2 are convolved with a three by three block
of pixels centered on the pixel being tested. The three by
three block of pixels is represented as:

     x_{11}  x_{12}  x_{13}
     x_{21}  x_{22}  x_{23}
     x_{31}  x_{32}  x_{33}

where the pixel x_{22} is the pixel being tested. Thus the
statistics are calculated with the following equations:

     S_1 = (-1 \cdot x_{11}) - (2 \cdot x_{12}) - (1 \cdot x_{13}) + (1 \cdot x_{31}) + (2 \cdot x_{32}) + (1 \cdot x_{33})
     S_2 = (1 \cdot x_{11}) - (1 \cdot x_{13}) + (2 \cdot x_{21}) - (2 \cdot x_{23}) + (1 \cdot x_{31}) - (1 \cdot x_{33})

If S_1 plus S_2 is greater than the threshold value E_Y for
a particular 8x8 Y_tau2 block, the enhancement analyzer 144
adds the 8x8 Y_tau2 block to the enhancement list 250. If
S_1 plus S_2 is greater than the threshold value E_U for a
particular 4x4 U_tau4 block, the enhancement analyzer 144
adds the 4x4 U_tau4 block to the enhancement list 250. If
S_1 plus S_2 is greater than the threshold value E_X for a
particular 4x4 X_tau4 block, the enhancement analyzer 144
adds the 4x4 X_tau4 block to the enhancement list 250.
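
The block test can be sketched as follows, using the two masks
given above in row-major order. Two points are assumptions
because the passage does not settle them: pixels on the image
border, which lack a full three by three neighbourhood, are
skipped, and a block is flagged as soon as any one of its pixels
gives S_1 plus S_2 above the threshold. The function names and
sample images are hypothetical.

```python
M1 = (-1, -2, -1, 0, 0, 0, 1, 2, 1)
M2 = (1, 0, -1, 2, 0, -2, 1, 0, -1)


def mask_response(img, r, c, mask):
    """Apply a 3 x 3 mask (row-major) centred on pixel (r, c)."""
    return sum(mask[3 * dr + dc] * img[r - 1 + dr][c - 1 + dc]
               for dr in range(3) for dc in range(3))


def block_needs_enhancement(img, top, left, size, threshold):
    """Return True if any interior pixel of the block exceeds the threshold.

    Assumptions: border pixels are skipped, and a single pixel with
    S_1 + S_2 above the threshold is enough to flag the block.
    """
    height, width = len(img), len(img[0])
    for r in range(max(top, 1), min(top + size, height - 1)):
        for c in range(max(left, 1), min(left + size, width - 1)):
            s1 = mask_response(img, r, c, M1)
            s2 = mask_response(img, r, c, M2)
            if s1 + s2 > threshold:
                return True
    return False


# Hypothetical 6 x 6 test images: a flat field and a horizontal edge.
flat = [[10] * 6 for _ in range(6)]
edge = [[0] * 6 for _ in range(3)] + [[100] * 6 for _ in range(3)]
flat_flagged = block_needs_enhancement(flat, 0, 0, 6, 0)
edge_flagged = block_needs_enhancement(edge, 0, 0, 6, 100)
```

Because S_1 and S_2 are signed, an edge of the opposite polarity
produces a negative sum in this sketch; whether magnitudes
should be used instead is not stated in this passage.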

In addition to the enhancement list 250, the enhancement
analyzer 144 also uses the DCT coefficients 198 to identify
visually unimportant "texture" regions where the compression
ratio can be increased without significant loss to the image
quality.
Optimized DCT

The preferred embodiment of the optimized DCT 136 is
illustrated in FIG. 9. More specifically, the optimized DCT
136 uses the quantization table Q 202 to assign the DCT
coefficients (DC terms 200 and AC terms 201) quantization
step values. In addition, the quantization step values in
the quantization table Q 202 vary depending on the optimized
DCT 136 operation mode. The optimized DCT 136 operates in
four DCT modes as follows: 1) switched fixed uniform DCT
quantization tables that correspond to image classification,
2) optimal reconstruction values, 3) adaptive uniform DCT
quantization tables, and 4) adaptive non-uniform DCT
quantization tables.

The fixed DCT quantization tables are tuned to different
image types, including eight standard tables corresponding to
images differing along three dimensions: photographic versus
graphic, small-scale versus large-scale, and high-activity
versus low-activity. In the preferred embodiment, additional
tables can be added to the resource file 160 (not shown).
The control script 196 defines which standard table the
optimized DCT 136 uses in the fixed-table DCT mode. In the
fixed-table mode, the quantized step value for each DCT
coefficient is obtained by linearly quantizing each DCT
coefficient $x_i$ with the quantization value $q_i$ in quantization
table Q. The mathematical relationship for the quantization
procedure is:

for i = 0, 1, ..., 63
if $x_i \ge 0$,



if $x_i < 0$,
Reconstruction is also linear unless reconstruction
values have been computed and stored in the CS data segment
204. Letting r denote the dequantized DCT coefficients and c
the quantized levels, the linear dequantization formula is:

for i = 0, 1, ..., 63
$$r_i = c_i \cdot q_i$$
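
A minimal sketch of linear quantization and dequantization follows. The dequantization step is the formula given above. The rounding rule in `quantize` (round to nearest, symmetric about zero) is an assumption, because the quantization formula itself is not legible in this text; the function names are hypothetical.

```python
import numpy as np

def quantize(x, q):
    """Linear quantization with symmetric round-to-nearest (assumed rule)."""
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    return (np.sign(x) * np.floor(np.abs(x) / q + 0.5)).astype(int)

def dequantize(c, q):
    """Linear dequantization: r_i = c_i * q_i."""
    return np.asarray(c) * np.asarray(q, dtype=float)

coeffs = [10.0, -10.0, 3.0, -3.0]
levels = quantize(coeffs, 4.0)
print(levels.tolist(), dequantize(levels, 4.0).tolist())
```

Handling the two signs separately, as the text does, keeps the quantizer symmetric so that positive and negative coefficients of equal magnitude map to levels of equal magnitude.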

In the fixed-table DCT mode, the optimized DCT 136 can
also compute the optimal reconstruction values stored in the
CS data segment 204. While the DC term 201 is always
calculated linearly, the CS reconstruction values represent
the conditional expected value of each quantized level of
each AC term 200. The CS reconstruction values are
calculated for each AC term 200 by first calculating an
absolute value frequency histogram, $H_i$, for the ith
coefficient (for i = 1, 2, ..., 63) over all DCT blocks in
the source image, N, as follows:

for j = 0, 1, ..., N

$$H_i(k) = \mathrm{frequency}(\mathrm{abs}(x_{ij}) = k)$$

where $x_{ij}$ = the value of the ith coefficient in the
jth DCT block.

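Second, the centroid of coefficient values is calculated
between each quantization step. The formula for the centroid
of the ith coefficient in the kth quantization interval is:

$$CS_i(k) = \frac{\sum_{y \in I_k} y \, H_i(y)}{\sum_{y \in I_k} H_i(y)}$$

where $I_k$ denotes the set of absolute coefficient values that
fall in the kth quantization interval.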
This provides a non-linear mapping of quantized
coefficients onto reconstructed values as follows:

$$r_i = CS_i(c_i) \quad \text{for } i = 1, 2, \ldots, 63$$
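
The centroid idea can be illustrated with a short sketch. It assumes that the quantization intervals are those of a round-to-nearest uniform quantizer with step `q`, and it falls back to the linear value `k * q` for levels that never occur; both choices, and the name `centroid_table`, are assumptions made for illustration only.

```python
import numpy as np

def centroid_table(abs_values, q, max_level):
    """Mean absolute coefficient value observed in each quantization interval."""
    abs_values = np.asarray(abs_values, dtype=float)
    levels = np.floor(abs_values / q + 0.5).astype(int)  # assumed interval rule
    table = np.arange(max_level + 1, dtype=float) * q    # linear fallback
    for k in range(1, max_level + 1):
        members = abs_values[levels == k]
        if members.size:
            table[k] = members.mean()  # conditional expected value of level k
    return table

# Absolute values of one AC coefficient collected over several blocks.
table = centroid_table([4.0, 5.0, 5.0, 9.0], q=4.0, max_level=3)
print(table)
```

Because the values in a quantization interval are rarely distributed symmetrically around the interval's midpoint, the centroid gives a lower expected reconstruction error than the linear value.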

In the adaptive uniform DCT quantization mode, the image
classifier 152 outputs the control script 196 that
directs the optimized DCT 136 to adjust a given DCT uniform
quantization table Q 202 to provide more efficient
compression while holding the visual quality constant. This
method adjusts the DCT quantization step sizes such that the
compressed bit rate (entropy) after quantizing the DCT
coefficients is minimized subject to the constraint that the
visually-weighted mean squared error arising from the DCT
quantization is held constant with respect to the base
quantization table and the user-supplied quantization
parameter L.

The optimized DCT 136 uses marginal analysis to adjust
the DCT quantization step sizes. A "marginal rate of
transformation (MRT)" is computed for each DCT coefficient.
The MRT represents the rate at which bits are "transformed"
into (a reduction of) the visually weighted mean squared
error (VMSE). The MRT of a coefficient is defined as the
ratio of 1) the marginal change in the encoded bit rate with
respect to a quantization step value q to 2) the marginal
change in the visual mean square error with respect to the
quantization step value q.

The MRT (bits/VMSE) ratio is calculated as follows:

$$\mathrm{MRT}\ (\mathrm{bits/VMSE}) = \frac{\Delta \mathrm{bits} / \Delta q}{\Delta \mathrm{VMSE} / \Delta q}$$

Decreasing the quantization step value q adds more
bits to the representation of the corresponding DCT
coefficient, and adding more bits to the representation
of a DCT coefficient reduces the VMSE. Since the added
bits are usually transformed into a VMSE
reduction, the MRT is generally negative.

The MRT is calculated for all of the DCT coefficients.
The adaptive method utilized by the optimized DCT 136 adjusts
the quantization step values q of the quantization table Q 202
by reducing the quantization step value q corresponding to
the maximum MRT and increasing the quantization step value q
corresponding to the minimum MRT. The optimized DCT 136
repeats the process until the MRT is equalized across all of
the DCT coefficients while holding the VMSE constant.

FIG. 32 shows a flow chart of the process of creating an
adaptive uniform DCT quantization table. In a step 700 the
optimized DCT 136 computes the MRT values for all DCT
coefficients i. In step 702 the optimized DCT 136 compares
the MRT values. If the MRT values are the same, the optimized
DCT 136 uses the resulting quantization table Q 202. If the
MRT values are not equal, the optimized DCT 136 finds the
minimum MRT value and the maximum MRT value for the DCT
coefficients i in step 706.

In step 708, the optimized DCT 136 increases the
quantization step value $q_{low}$ corresponding to the minimum MRT
value and decreases the quantization step value $q_{high}$
associated with the maximum MRT value. Increasing $q_{low}$
reduces the number of bits devoted to the corresponding DCT
coefficient but does not increase the VMSE appreciably.
Reducing the quantization step value $q_{high}$ increases the
number of bits devoted to the corresponding DCT coefficient
and reduces the VMSE significantly. The optimized DCT 136
offsets the adjustments for the quantization step values $q_{low}$
and $q_{high}$ in order to keep the VMSE constant.

The optimized DCT 136 returns to step 700, where the
process is repeated until all MRT values are equal. Once all
of the quantization step values q are determined, the
resulting quantization table Q 202 is complete.
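
The loop of steps 700 to 708 can be sketched as follows. This is a toy illustration, not the patented procedure: the rate model `bits` and the distortion model `vmse` are assumed stand-ins (a logarithmic rate and a weighted q^2/12 error), and the rule for offsetting the two adjustments (moving both coefficients to the mean of their VMSE contributions) is one simple way to keep the total VMSE constant. All names are hypothetical.

```python
import numpy as np

def bits(q, sigma):
    # Toy rate model (assumption): bits fall as the step size grows.
    return np.log2(sigma / q)

def vmse(q, w):
    # Toy visually weighted distortion model (assumption): w * q^2 / 12.
    return w * q ** 2 / 12.0

def mrt(q, sigma, w, dq=1e-6):
    # Finite-difference form of (delta bits / delta q) / (delta VMSE / delta q).
    return (bits(q + dq, sigma) - bits(q, sigma)) / (vmse(q + dq, w) - vmse(q, w))

def equalize_mrt(q, sigma, w, tol=1e-4, max_iter=10000):
    q = np.array(q, dtype=float)
    for _ in range(max_iter):
        m = mrt(q, sigma, w)                                 # step 700
        low, high = int(np.argmin(m)), int(np.argmax(m))     # step 706
        if m[high] - m[low] <= tol * abs(m[low]):            # step 702
            break
        v = vmse(q, w)
        shift = 0.5 * (v[high] - v[low])
        # Step 708: raise q_low, lower q_high; the total VMSE is unchanged.
        q[low] = np.sqrt(12.0 * (v[low] + shift) / w[low])
        q[high] = np.sqrt(12.0 * (v[high] - shift) / w[high])
    return q

q0 = np.array([2.0, 4.0, 8.0, 16.0])
sigma = np.array([50.0, 30.0, 20.0, 10.0])
w = np.array([1.0, 0.5, 0.25, 0.1])
q = equalize_mrt(q0, sigma, w)
print(q, vmse(q, w).sum(), vmse(q0, w).sum())
```

Under these toy models the MRT is negative for every coefficient, consistent with the text, and the loop ends with equal MRT values and the same total VMSE as the starting table.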
Reed Spline Filter
FIGs. 34-57 illustrate a preferred embodiment of the
Reed Spline Filter 138 which is advantageously used for the
first, second and third Reed Spline Filters 148, 225, and
227. The Reed Spline Filter in FIGs. 34-57 is described in
terms of a generic image format. In particular, the image
input data comprises Y image input which corresponds, for
example, to the red, green and blue image data in the first
Reed Spline Filter 148 in the foregoing discussion. In like
manner, the outputs of the Reed Spline Filter 138 described as
reconstruction values should be understood to correspond, for
example, to the R_tau2 miniature 180, the G_tau2 miniature
182 and the B_tau2 miniature 184 of the first Reed Spline
Filter 148.

The Reed Spline Filter is based on a least-mean-square
(LMS) error spline approach, which is extendable
to N dimensions. One- and two-dimensional image data
compression utilizing linear and planar splines,
respectively, are shown to have compact, closed-form optimal
solutions for convenient, effective compression. The
computational efficiency of this new method is of special
interest, because the compression/reconstruction algorithms
proposed herein involve only the Fast Fourier Transform (FFT)
and inverse FFT types of processors or other high-speed
direct convolution algorithms. Thus, the compression and
reconstruction from the compressed image can be extremely
fast and realized in existing hardware and software. Even
with this high computational efficiency, good image quality
is obtained upon reconstruction. An important and practical
consequence of the disclosed method is the convenience and
versatility with which it is integrated into a variety of
hybrid digital data compression systems.
I. SPLINE FILTER OVERVIEW 

The basic process of digital image coding entails 
transforming a source image X into a "compressed" image Y 
such that the signal energy of Y is concentrated into fewer 
produces compressed data Y', which is substantially free of
aliasing prior to subsequent process steps. While the
convolution or decimation filter 1014 attenuates aliasing
effects, it does so by reducing the number of bits required
to represent the signal. It is "low-pass" in nature,
reducing the information content of the reconstructed image
X'. Consequently, the residue $\Delta X$ 1012 will be larger, and in
part, will offset the compression attained through
decimation.

The present invention disclosed herein solves this
problem by providing a method of optimizing the compressed
data such that the mean-square-residue $\langle \Delta X^2 \rangle$ is minimized,
where "< >" shall herein denote an averaging process. As
shown in FIG. 36, compressed data Y', generated in a manner
similar to that shown in FIG. 35, is further processed by an
optimization process 1018. Accordingly, the optimization
process 1018 is dependent upon the properties of convolution
filter 1014 and is constrained such that the variation of the
mean-square-residue is zero, $\delta \langle \Delta X^2 \rangle = 0$. The disclosed method
of filter optimization "matches" the filter response to the
image data, thereby minimizing the residue. Since the
decimation filter 1014 is low-pass in nature, the
optimization process 1018, in part, compensates by
effectively acting as a "self-tuned" high-pass filter. A
brief descriptive overview of the optimization procedure is
provided in the following sections.

A. Image Approximation by Spline Functions
As will become clear in the following detailed
description, the input decimation filter 1014 of FIG. 36 may
be regarded as a projection of an image data vector X onto a
set of basis functions that constitute shifted, but
overlapping, spline functions $\{\Psi_k\}$ such that

$$X \approx X' = \sum_k x_k \Psi_k ,$$

where X' is the reconstructed image vector and $x_k$ is the
decomposition weight. The image data vector X is thus
approximated by an array of preferably computationally
simple, continuous functions, such as lines or planes,
allowing also an efficient reconstruction of the original
image.

According to the method, the basis functions need not be
orthogonal and are preferably chosen to overlap in order to
provide a continuous approximation to image data, thereby
rendering a non-diagonal basis correlation matrix:

$$A_{jk} = \Psi_j \cdot \Psi_k .$$

This property is exploited by the method of the present
invention, since it allows the user to "adapt" the response
of the filter by the nature and degree of cross-correlation.
Furthermore, the basis of spline functions need not be
complete in the sense of spanning the space of all image
data, but preferably generates a close approximation to image
X. It is known that the decomposition of image vector X into
components of differing spline basis functions is not
unique. The method herein disclosed optimizes the projection
by adjusting the weights $x_k$ such that the differential
variation of the average residue vanishes, $\delta \langle \Delta X^2 \rangle = 0$, or
equivalently $\langle \Delta X^2 \rangle = \min$. In general, it will be expected that
a more complete basis set will provide a smaller residue and
better compression, which, however, requires greater
computational overhead. Accordingly,
it is preferable to utilize a computationally simple basis
set, which is easy to manipulate in closed form and which
renders a small residual image. This residual image or
residue $\Delta X$ is preferably retained for subsequent processing
or reconstruction. In this respect there is a compromise
between computational complexity, compression, and the
magnitude of the residue.

In a schematic view, a set of spline basis functions
$S' = \{\Psi_k\}$ may be regarded as a subset of vectors in the domain
of possible image vectors $S = \{X\}$, as depicted in FIG. 37. The
decomposition or projection of X onto components of S' is not
unique and may be accomplished in a number of ways. A
preferable criterion set forth in the present description is
a least-mean-square (LMS) error, which minimizes the overall
difference between the source image X and the reconstructed
image X'. Geometrically, the residual image $\Delta X$ can be
thought of as a minimal vector in the sense that it is the
shortest possible vector connecting X to X'. That is, $\Delta X$
might, for instance, be orthogonal to the subspace S', as
shown in FIG. 37. As will be elaborated in the next
section, the projection of image vector X onto S' is
approximated by an expression of the form:

$$X \approx X' = \sum_k x_k \Psi_k .$$

The "best" X' is determined by the constraint that $\Delta X = X - X'$ is
minimized with respect to variations in the weights $x_j$:

$$\frac{\partial}{\partial x_j} \left( \Delta X \cdot \Delta X \right) = -2\, \Psi_j \cdot \Big( X - \sum_k x_k \Psi_k \Big) = 0 ,$$

which, by analogy to FIG. 37, describes an orthogonal
projection of X onto S'.

Generally, the above system of equations which
determines the optimal $x_k$ may be regarded as a linear
transformation, which maps X onto S' optimally, represented
here by:

$$\Psi_j \cdot X = \sum_k A_{jk}\, x_k ,$$

where $A_{jk} = \Psi_j \cdot \Psi_k$ is a transformation matrix having elements
representing the correlation between basis vectors $\Psi_j$ and $\Psi_k$.
The optimal weights $x_k$ are determined by the inverse
operation $A^{-1}$:

$$x_k = \sum_j (A^{-1})_{kj}\, (\Psi_j \cdot X) ,$$

rendering compression with the least residue. One skilled in
the art of LMS criteria will know how to express the
processes given here in the geometry of multiple dimensions.
Hence, the processes described herein are applicable to a
variety of image data types.

The present brief and general description has direct
processing counterparts depicted in FIG. 36. The operation

$$\Psi_j \cdot X$$

represents a convolution filtering process 1014, and

$$A^{-1} (\Psi \cdot X)$$

represents the optimizing process 1018.

In addition, as will be demonstrated in the following
sections, the inverse operation $A^{-1}$ is equivalent to a
so-called inverse eigenfilter when taken over to the conjugate
image domain. Specifically,

$$\mathrm{DFT}(x_k) = \frac{1}{\lambda_m}\, \mathrm{DFT}(X \cdot \Psi_k) ,$$

where DFT is the familiar discrete Fourier transform (DFT)
and $\lambda_m$ are the eigenvalues of A. The equivalent optimization
block 1018, shown in FIG. 38, comprises three steps: (1) a
discrete Fourier transformation (DFT) 1020; (2) inverse
eigenfiltering 1022; and (3) an inverse discrete Fourier
transformation ($\mathrm{DFT}^{-1}$) 1024. The advantages of this
embodiment, in part, rely on the fast coding/reconstruction
speed, since only the DFT and $\mathrm{DFT}^{-1}$ are the primary computations,
where now the optimization is a simple division. Greater
elaboration into the principles of the method is provided in
Section II, where also the presently contemplated preferred
embodiments are derived as closed form solutions for a
one-dimensional linear spline basis and two-dimensional planar
spline bases. Section III provides an operational
description for the preferred method of compression and
reconstruction utilizing the optimal procedure disclosed in
Section II. Section IV discloses results of a reduction to
practice of the preferred embodiments applied to one- and
two-dimensional images. Finally, Section V discloses a
preferred method of the filter optimizing process implemented
in the image domain.

II. IMAGE DATA COMPRESSION BY OPTIMAL SPLINE INTERPOLATION 

A. One-Dimensional Data Compression by LMS-Error
Linear Splines

For one-dimensional image data, bi-linear spline
functions are combined to approximate the image data with a
resultant linear interpolation, as shown in FIG. 39. The
resultant closed-form approximating and optimizing process
has a significant advantage in computational simplicity and
speed.

Letting the decimation index $\tau$ and image sampling period
t be fixed, positive integers $\tau, t = 1, 2, \ldots$, and letting X(t)
be a periodic sequence of data of period $n\tau$, where n is also
an integer, consider a periodic, linear spline 1014 of period
$n\tau$ of the type,

$$F(t) = F(t + n\tau) , \tag{1}$$

where

$$F(t) = \begin{cases} 1 - \dfrac{|t|}{\tau} & \text{for } |t| \le \tau \\ 0 & \text{otherwise within one period} \end{cases} \tag{2}$$

as shown by the functions $\psi_k(t)$ 1014 of FIG. 39.

The family of shifted linear splines F(t) is defined as
follows:

$$\psi_k(t) = F(t - k\tau) \quad \text{for } (k = 0, 1, 2, \ldots, (n-1)) . \tag{3}$$

One object of the present embodiment is to approximate X(t)
by the n-point sum:

$$S(t) = \sum_{k=0}^{n-1} X_k\, \psi_k(t) , \tag{4}$$

in a least-mean-squares fashion, where $X_0, \ldots, X_{n-1}$ are n
reconstruction weights. Observe that the two-point sum in
the interval $0 < t < \tau$ is:

$$X_0\, \psi_0(t) + X_1\, \psi_1(t) = X_0 \left( 1 - \frac{t}{\tau} \right) + X_1 \left( \frac{t}{\tau} \right) . \tag{5}$$

Hence S(t) 1030 in Equation 4 represents a linear
interpolation of the original waveform X(t) 1002, as shown in
FIG. 39.

To find the "best" weights $X_0, \ldots, X_{n-1}$, the quantity
$L(X_0, X_1, \ldots, X_{n-1})$ is minimized:

$$L(X_0, X_1, \ldots, X_{n-1}) = \sum_{t=-\tau}^{n\tau} \Big( X(t) - \sum_{k=0}^{n-1} X_k\, \psi_k(t) \Big)^2 , \tag{6}$$

where the sum has been taken over one period plus $\tau$ of the
data. L is minimized by differentiating as follows:
20 
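$$\frac{\partial L}{\partial X_j} = -2 \sum_{t=-\tau}^{n\tau} \Big( X(t) - \sum_{k=0}^{n-1} X_k\, \psi_k(t) \Big) \psi_j(t) = 0 . \tag{7}$$

This leads to the system,

$$\sum_{k=0}^{n-1} A_{jk}\, X_k = Y_j \tag{8}$$

of linear equations for $X_k$, where

$$A_{jk} = \sum_{t=-\tau}^{n\tau} \psi_j(t)\, \psi_k(t) \quad \text{for } (j, k = 0, 1, \ldots, n-1) \tag{9}$$

and

$$Y_j = \sum_{t=-\tau}^{n\tau} X(t)\, \psi_j(t) \quad \text{for } (j = 0, 1, \ldots, n-1) . \tag{10}$$

The term $Y_j$ in Equation 10 is reducible as follows:

$$Y_j = \sum_{t=-\tau}^{n\tau} X(t)\, F(t - j\tau) . \tag{11}$$

Letting $(t - j\tau) = m$, then:

$$Y_j = \sum_{m=-\tau+1}^{\tau-1} X(m + j\tau)\, F(m) \quad \text{for } (j = 0, 1, 2, \ldots, n-1) . \tag{12}$$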
The $Y_j$'s in Equation 12 represent the compressed data to
be transmitted or stored. Note that this encoding scheme
involves n correlation operations on only $2\tau - 1$ points.

Since F(t) is assumed to be periodic with period $n\tau$,
the matrix form of $A_{jk}$ in Equation 9 can be reduced by
substituting Equation 3 into Equation 9 to obtain:

$$A_{jk} = \sum_{m=-\tau+1}^{\tau-1} F(m + (j-k)\tau)\, F(m) = \begin{cases} \displaystyle\sum_{m=-\tau+1}^{\tau-1} (F(m))^2 \triangleq \alpha & \text{if } j - k \equiv 0 \bmod n \\ \displaystyle\sum_{m=-\tau+1}^{\tau-1} F(m \pm \tau)\, F(m) \triangleq \beta & \text{if } j - k \equiv \pm 1 \bmod n \\ 0 & \text{otherwise} \end{cases} \tag{13}$$

By Equation 13, $A_{jk}$ can be expressed also in circulant form
in the following manner:

$$A_{jk} = a_{(k-j)_n} , \tag{14}$$

where $(k-j)_n$ denotes $(k-j) \bmod n$, and

$$a_0 = \alpha,\ a_1 = \beta,\ a_2 = 0,\ \ldots,\ a_{n-2} = 0,\ a_{n-1} = \beta . \tag{15}$$
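Therefore, $A_{jk}$ in Equations 14 and 15 has explicitly the
following equivalent circulant matrix representations:

$$[A_{jk}] = \begin{bmatrix} A_{0,0} & A_{0,1} & \cdots & A_{0,n-1} \\ A_{1,0} & A_{1,1} & \cdots & A_{1,n-1} \\ \vdots & \vdots & & \vdots \\ A_{n-1,0} & A_{n-1,1} & \cdots & A_{n-1,n-1} \end{bmatrix} = \begin{bmatrix} a_0 & a_1 & a_2 & \cdots & a_{n-1} \\ a_{n-1} & a_0 & a_1 & \cdots & a_{n-2} \\ a_{n-2} & a_{n-1} & a_0 & \cdots & a_{n-3} \\ \vdots & \vdots & \vdots & & \vdots \\ a_1 & a_2 & a_3 & \cdots & a_0 \end{bmatrix} = \begin{bmatrix} \alpha & \beta & 0 & \cdots & \beta \\ \beta & \alpha & \beta & \cdots & 0 \\ 0 & \beta & \alpha & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ \beta & 0 & 0 & \cdots & \alpha \end{bmatrix} \tag{16}$$
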

One skilled in the art of matrix and filter analysis
will appreciate that the periodic boundary conditions imposed
on the data lie outside the window of observation and may be
defined in a variety of ways. Nevertheless, periodic
boundary conditions serve to simplify the process
implementation by ensuring that the correlation matrix $[A_{jk}]$
has a calculable inverse. Thus, the optimization process
involves an inversion of $[A_{jk}]$, of which the periodic boundary
conditions and consequent circulant character play a
preferred role. It is also recognized that for certain
spline functions, symmetry rendered in the correlation matrix
allows inversion in the absence of periodic image boundary
conditions.

B. Two-Dimensional Data Compression by Planar Splines

For two-dimensional image data, multi-planar spline
functions are combined to approximate the image data with a
resultant planar interpolation. In FIG. 40, $X(t_1, t_2)$ is a
doubly periodic array of image data (e.g., still image) of
periods $n_1\tau$ and $n_2\tau$, with respect to the integer variables $t_1$
and $t_2$, where $\tau$ is a multiple of both $t_1$ and $t_2$. The actual
image 1002 to be compressed can be viewed as being repeated
periodically throughout the plane as shown in FIG. 40.
Each subimage of the extended picture is separated by a
border 1032 (or gutter) of zero intensity of width $\tau$. This
border is one of several possible preferred "boundary
conditions" to achieve a doubly-periodic image.

Consider now a doubly periodic planar spline, $F(t_1, t_2)$,
which has the form of a six-sided pyramid or tent, centered
at the origin, and is repeated periodically with periods $n_1\tau$
and $n_2\tau$ with respect to integer variables $t_1$ and $t_2$,
respectively. A perspective view of such a planar spline
function 1034 is shown in FIG. 41a and may hereinafter be
referred to as a "hexagonal tent." Following the
one-dimensional case by analogy, letting:

$$\psi_{k_1 k_2}(t_1, t_2) = F(t_1 - k_1\tau,\ t_2 - k_2\tau) \tag{17}$$

for $(k_1 = 0, 1, \ldots, n_1 - 1)$ and $(k_2 = 0, 1, \ldots, n_2 - 1)$, the "best"
weights $X_{k_1 k_2}$ are found such that:

$$L(X_{k_1 k_2}) = \sum_{t_1, t_2 = -\tau}^{n_1\tau,\ n_2\tau} \Big( X(t_1, t_2) - \sum_{k_1, k_2 = 0}^{n_1 - 1,\ n_2 - 1} X_{k_1 k_2}\, \psi_{k_1 k_2}(t_1, t_2) \Big)^2 \tag{18}$$

is a minimum.
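A condition for L to be a minimum is

$$\frac{\partial L}{\partial X_{j_1 j_2}} = -2 \sum_{t_1, t_2 = -\tau}^{n_1\tau,\ n_2\tau} \Big( X(t_1, t_2) - \sum_{k_1, k_2 = 0}^{n_1 - 1,\ n_2 - 1} X_{k_1 k_2}\, \psi_{k_1 k_2}(t_1, t_2) \Big) \psi_{j_1 j_2}(t_1, t_2) = 0 . \tag{19}$$

The best coefficients $X_{k_1 k_2}$ are the solution of the 2nd-order
tensor equation,

$$A_{j_1 j_2 k_1 k_2}\, X_{k_1 k_2} = Y_{j_1 j_2} , \tag{20}$$

where the summation is on $k_1$ and $k_2$,

$$A_{j_1 j_2 k_1 k_2} = \sum_{t_1, t_2 = -\tau}^{n_1\tau,\ n_2\tau} \psi_{j_1 j_2}(t_1, t_2)\, \psi_{k_1 k_2}(t_1, t_2) \tag{21}$$

and

$$Y_{j_1 j_2} = \sum_{t_1, t_2 = -\tau}^{n_1\tau,\ n_2\tau} X(t_1, t_2)\, \psi_{j_1 j_2}(t_1, t_2) . \tag{22}$$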



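With the visual aid of FIG. 41a, the tensor $Y_{j_1 j_2}$ reduces
as follows:

$$Y_{j_1 j_2} = \sum_{t_1, t_2 = -\tau}^{n_1\tau,\ n_2\tau} X(t_1, t_2)\, F(t_1 - j_1\tau,\ t_2 - j_2\tau) . \tag{23}$$

Letting $t_k - j_k\tau = m_k$ for k = 1, 2, then

$$Y_{j_1 j_2} = \sum_{m_1, m_2 = -\tau+1}^{\tau-1} X(m_1 + j_1\tau,\ m_2 + j_2\tau)\, F(m_1, m_2) \tag{24}$$

for $(j_1 = 0, 1, \ldots, n_1 - 1)$ and $(j_2 = 0, 1, \ldots, n_2 - 1)$, where $F(m_1, m_2)$
is the doubly periodic, six-sided pyramidal function, shown
in FIG. 41a. The tensor transform in Equation 21 is treated
in a similar fashion to obtain

$$A_{j_1 j_2 k_1 k_2} = \sum_{t_1, t_2 = -\tau}^{n_1\tau,\ n_2\tau} \psi_{j_1 j_2}(t_1, t_2)\, \psi_{k_1 k_2}(t_1, t_2) = \sum_{m_1, m_2 = -\tau+1}^{\tau-1} F(m_1 + (j_1 - k_1)\tau,\ m_2 + (j_2 - k_2)\tau)\, F(m_1, m_2)$$

$$= \begin{cases} \sum F(m_1, m_2)^2 \triangleq \alpha & \text{if } (j_1 - k_1) \equiv 0 \bmod n_1 \wedge (j_2 - k_2) \equiv 0 \bmod n_2 \\ \sum F(m_1 \pm \tau,\ m_2)\, F(m_1, m_2) \triangleq \beta & \text{if } (j_1 - k_1) \equiv \pm 1 \bmod n_1 \wedge (j_2 - k_2) \equiv 0 \bmod n_2 \\ \sum F(m_1,\ m_2 \pm \tau)\, F(m_1, m_2) \triangleq \gamma & \text{if } (j_1 - k_1) \equiv 0 \bmod n_1 \wedge (j_2 - k_2) \equiv \pm 1 \bmod n_2 \\ \sum F(m_1 \pm \tau,\ m_2 \pm \tau)\, F(m_1, m_2) \triangleq \xi & \text{if } (j_1 - k_1) \equiv \pm 1 \bmod n_1 \wedge (j_2 - k_2) \equiv \pm 1 \bmod n_2 \\ \sum F(m_1 \mp \tau,\ m_2 \pm \tau)\, F(m_1, m_2) \triangleq \eta & \text{if } (j_1 - k_1) \equiv \mp 1 \bmod n_1 \wedge (j_2 - k_2) \equiv \pm 1 \bmod n_2 \end{cases} \tag{25}$$

where each sum in the cases is taken over $m_1, m_2 = -\tau+1, \ldots, \tau-1$.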

The values of $\alpha$, $\beta$, $\gamma$, $\xi$ and $\eta$ depend on $\tau$, and the shape
and orientation of the hexagonal tent with respect to the
image domain, where for example $m_1$ and $m_2$ represent row and
column indices. For greater flexibility in tailoring the
hexagonal tent function, it is possible to utilize all
parameters of the $[A_{j_1 j_2 k_1 k_2}]$. However, to minimize
calculational overhead it is preferable to employ symmetric
hexagons, disposed over the image domain with a
bi-directional period $\tau$. Under these conditions, $\beta = \gamma = \xi$ and $\eta = 0$,
simplifying $[A_{j_1 j_2 k_1 k_2}]$ considerably. Specifically, the
hexagonal tent depicted in FIG. 41a and having an orientation
depicted in FIG. 41b is described by the preferred case in
which $\beta = \gamma = \xi$ and $\eta = 0$. It will be appreciated that other
orientations and shapes of the hexagonal tent are possible,
as depicted, for example, in FIG. 41c. Combinations of
hexagonal tents are also possible and embody specific
preferable attributes. For example, a superposition of the
hexagonal tents shown in FIGs. 41b and 41c effectively
"symmetrizes" the compression process.

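From Equation 25 above, $A_{j_1 j_2 k_1 k_2}$ can be expressed in
circulant form by the following expression:

$$A_{j_1 j_2 k_1 k_2} = a_{(k_1 - j_1)_{n_1},\ (k_2 - j_2)_{n_2}} , \tag{26}$$

where $(k_\ell - j_\ell)_{n_\ell}$ denotes $(k_\ell - j_\ell) \bmod n_\ell$, $\ell = 1, 2$, and

$$[a_{s_1 s_2}] = \begin{bmatrix} a_{0,0} & a_{0,1} & a_{0,2} & \cdots & a_{0,n_2 - 1} \\ a_{1,0} & a_{1,1} & a_{1,2} & \cdots & a_{1,n_2 - 1} \\ a_{2,0} & a_{2,1} & a_{2,2} & \cdots & a_{2,n_2 - 1} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n_1 - 1,0} & a_{n_1 - 1,1} & a_{n_1 - 1,2} & \cdots & a_{n_1 - 1,n_2 - 1} \end{bmatrix} = \begin{bmatrix} \alpha & \beta & 0 & \cdots & 0 & \beta \\ \beta & \beta & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ \beta & 0 & 0 & \cdots & 0 & \beta \end{bmatrix} \tag{27}$$

where $(s_1 = 0, 1, 2, \ldots, n_1 - 1)$ and $(s_2 = 0, 1, 2, \ldots, n_2 - 1)$. Note
that when $[a_{s_1 s_2}]$ is represented in matrix form, it is "block
circulant."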

C. Compression-Reconstruction Algorithms

Because the objective is to apply the above-disclosed
LMS error linear spline interpolation techniques to image
sequence coding, it is advantageous to utilize the tensor
formalism during the course of the analysis in order to
readily solve the linear systems in Equations 8 and 20.
Here, the tensor summation convention is used in the analysis
for one and two dimensions. It will be appreciated that such
convention may readily apply to the general case of N
dimensions.

1. Linear Transformation of Tensors

A linear transformation of a 1st-order tensor is written
as

$$Y_r = A_{rs}\, X_s \quad (\text{sum on } s) , \tag{28}$$

where $A_{rs}$ is a linear transformation, and $Y_r$, $X_s$ are 1st-order
tensors. Similarly, a linear transformation of a second
order tensor is written as:

$$Y_{r_1 r_2} = A_{r_1 r_2 s_1 s_2}\, X_{s_1 s_2} \quad (\text{sum on } s_1, s_2) . \tag{29}$$

The product or composition of linear transformations is
defined as follows. When the above Equation 29 holds, and

$$Z_{q_1 q_2} = B_{q_1 q_2 r_1 r_2}\, Y_{r_1 r_2} , \tag{30}$$

then

$$Z_{q_1 q_2} = B_{q_1 q_2 r_1 r_2}\, A_{r_1 r_2 s_1 s_2}\, X_{s_1 s_2} . \tag{31}$$

Hence,

$$C_{q_1 q_2 s_1 s_2} = B_{q_1 q_2 r_1 r_2}\, A_{r_1 r_2 s_1 s_2} \tag{32}$$

is the composition or product of two linear transformations.

2. Circulant Transformation of 1st-Order Tensors

The tensor method for solving Equations 8 and 20 is
illustrated for the 1-dimensional case below.
Letting $A_{rs}$ represent a circulant tensor of the form:

$$A_{rs} = a_{(s-r) \bmod n} \quad \text{for } (r, s = 0, 1, 2, \ldots, n-1) , \tag{33}$$

and considering the n special 1st-order tensors as

$$W_s^{(\ell)} = (\omega^\ell)^s \quad \text{for } (\ell = 0, 1, 2, \ldots, n-1) , \tag{34}$$

where $\omega$ is the n-th root of unity, then

$$A_{rs}\, W_s^{(\ell)} = \lambda(\ell)\, W_r^{(\ell)} , \tag{35}$$

where

$$\lambda(\ell) = \sum_{j=0}^{n-1} a_j\, (\omega^\ell)^j \tag{36}$$

are the distinct eigenvalues of $A_{rs}$. The terms $W_s^{(\ell)}$ are
orthogonal:

$$W_s^{(\ell)}\, W_s^{(j)*} = \begin{cases} 0 & \text{for } \ell \ne j \\ n & \text{for } \ell = j \end{cases} \tag{37}$$

At this point it is convenient to normalize these
tensors as follows:

$$\phi_s^{(\ell)} = \frac{1}{\sqrt{n}}\, W_s^{(\ell)} \quad \text{for } (\ell = 0, 1, 2, \ldots, n-1) . \tag{38}$$
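$\phi_s^{(\ell)}$ evidently also satisfies the orthonormal property,
i.e.

$$\phi_s^{(\ell)}\, \phi_s^{(j)*} = \delta_{\ell j} , \tag{39}$$

where $\delta_{\ell j}$ is the Kronecker delta function and * represents
complex conjugation.

A linear transformation is formed by summing the n dyads
$\phi_r^{(\ell)} \phi_s^{(\ell)*}$ for $\ell = 0, 1, \ldots, n-1$ under the summation sign as
follows:

$$\tilde{A}_{rs} = \sum_{\ell=0}^{n-1} \lambda(\ell)\, \phi_r^{(\ell)}\, \phi_s^{(\ell)*} . \tag{40}$$

Then

$$\tilde{A}_{rs}\, \phi_s^{(j)} = \sum_{\ell=0}^{n-1} \lambda(\ell)\, \phi_r^{(\ell)}\, \phi_s^{(\ell)*}\, \phi_s^{(j)} = \sum_{\ell=0}^{n-1} \lambda(\ell)\, \phi_r^{(\ell)}\, \delta_{\ell j} = \lambda(j)\, \phi_r^{(j)} . \tag{41}$$

Since $\tilde{A}_{rs}$ has by a simple verification the same eigenvectors
and eigenvalues as the transformation $A_{rs}$ has in Equations 9
and 33, the transformations $\tilde{A}_{rs}$ and $A_{rs}$ are equal.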
3. Inverse Transformation of 1st-Order Tensors

The inverse transformation of $A_{rs}$ is shown next to be

$$A^{-1}_{rs} = \sum_{\ell=0}^{n-1} \lambda(\ell)^{-1}\, \phi_r^{(\ell)}\, \phi_s^{(\ell)*} . \tag{42}$$
This is easily proven by applying Equation 42 to Equation 41
and using the orthonormal property of Equation 39.
Consequently, the solution of $Y_r = A_{rs} X_s$ is

$$X_r = A^{-1}_{rs}\, Y_s = \sum_{\ell=0}^{n-1} \lambda(\ell)^{-1}\, \phi_r^{(\ell)}\, \phi_s^{(\ell)*}\, Y_s ,$$

so that the optimal weights are obtained by a discrete Fourier
transform (DFT) of the compressed data, a division by the
eigenvalues, and an inverse DFT.

The same result can be stated with matrix methods. Every
circulant matrix is "similar" to a diagonal matrix under the
DFT. If Q denotes the DFT matrix of dimension (n x n),
$Q^\dagger$ denotes the complex conjugate of the DFT matrix, and
$\Lambda$ is defined to be the eigenmatrix of A, then:

$$A = Q \Lambda Q^\dagger . \tag{46}$$

The solution to y = Ax is then

$$x = A^{-1} y = Q \Lambda^{-1} Q^\dagger y .$$

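For the one-dimensional process described above, the
eigenvalues of the transformation operators are

$$\lambda(\ell) = \sum_{j=0}^{n-1} a_j\, (\omega^\ell)^j = \mathrm{DFT}(a_j) , \tag{47}$$

where $a_0 = \alpha$, $a_1 = \beta$, $a_2 = 0, \ldots, a_{n-2} = 0$, and $a_{n-1} = \beta$. Hence:

$$\lambda(\ell) = \alpha + \beta\, (\omega^\ell + \omega^{-\ell}) . \tag{48}$$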

A direct extension of the 1st-order tensor concept to
the 2nd-order tensor will be apparent to those skilled in the
art. By solving the 2nd-order tensor equations, the results
are extended to compress a 2-D image. FIG. 42 depicts three
possible hexagonal tent functions for 2-dimensional image
compression indices $\tau$ = 2, 3, 4. The following table exemplifies
the relevant parameters for implementing the hexagonal tent
functions:





Decimation Index (τ)   | τ = 2        | τ = 3               | τ = 4
Compression Ratio      | 4            | 9                   | 16
α                      | a^2 + 6b^2   | a^2 + 6b^2 + 12c^2  | a^2 + 6b^2 + 12c^2 + 18d^2
β                      | b^2          | 2(c^2 + bc)         | 2d^2 + 2db + 4dc + c^2

substantially reduce computation overhead associated with
conjugate transform operations. Typically, such an
improvement is given by the ratio of computation steps
required to transform a set of N elements:

$$\frac{\mathrm{FFT}}{\mathrm{DFT}} = \frac{\frac{N}{2} \log_2(N)}{N^2} = \frac{1}{2N} \log_2(N) ,$$

which improves with the size of the image.
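
As a worked example of the ratio above, the following sketch evaluates it for a few transform sizes. The function name is hypothetical, and the operation counts are the idealized ones used in the formula rather than measured timings.

```python
import math

def fft_to_dft_ratio(n):
    """Ratio of idealized step counts: (N/2 * log2 N) / N^2 = log2(N) / (2N)."""
    return math.log2(n) / (2 * n)

for n in (64, 1024, 65536):
    print(n, fft_to_dft_ratio(n))
```

For N = 1024 the ratio is 10/2048, or about 0.005, so the FFT needs roughly 0.5 percent of the steps of a direct DFT at that size.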
A. The Compression Method

The coding method is specified in the following steps:

1. A suitable value of $\tau$ (an integer) is chosen. The
compression ratio is $\tau^2$ for two-dimensional images.

2. Equation 23 is applied to find $Y_{j_1 j_2}$, which is the
compressed data to be transmitted or stored:

$$Y_{j_1 j_2} = \sum_{t_1, t_2 = -\tau}^{n_1\tau,\ n_2\tau} X(t_1, t_2)\, F(t_1 - j_1\tau,\ t_2 - j_2\tau) .$$

B. The Reconstruction Method

The reconstruction method is shown below in the
following steps:

1. Find the $\mathrm{FFT}^{-1}$ of $Y_{j_1 j_2}$ (the compressed data).

2. The results of step 1 are divided by the
eigenvalues $\lambda(\ell, m)$. The eigenvalues $\lambda(\ell, m)$ are
found by extending Equation 48 to the two-dimensional
case (Equation 49), where $\omega_1$ is the $n_1$-th root of
unity and $\omega_2$ is the $n_2$-th root of unity.

3. The FFT of the results from step 2 is then taken.
After computing the FFT, $X_{k_1 k_2}$ (the optimized
weights) are obtained.

4. The recovered or reconstructed image is:

$$S(t_1, t_2) = \sum_{k_1, k_2} X_{k_1 k_2}\, \psi_{k_1 k_2}(t_1, t_2) . \tag{50}$$

5. Preferably, the residue is computed and retained
with the optimized weights:

$$\Delta X(t_1, t_2) = X(t_1, t_2) - S(t_1, t_2) .$$



Although the optimizing procedure outlined above appears to
be associated with an image reconstruction process, it may be
implemented at any stage between the aforementioned
compression and reconstruction. It is preferable to implement
the optimizing process immediately after the initial
compression so as to minimize the residual image. The
preferred order has an advantage with regard to storage,
transmission and the incorporation of subsequent image
processes.
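
The compression and reconstruction steps can be illustrated end to end with a one-dimensional analogue. This is a sketch under stated assumptions, not the two-dimensional method itself: it uses the linear spline of Equation 2, periodic wrap-around instead of a zero border, the correlation of Equation 12 for compression, and the eigenvalues of Equation 48 for the optimization. It requires n ≥ 3, and all function names are hypothetical.

```python
import numpy as np

def tent(m, tau):
    # Linear spline F(m) = 1 - |m|/tau for |m| <= tau, 0 elsewhere.
    return np.maximum(0.0, 1.0 - np.abs(m) / tau)

def compress(x, tau):
    # Equation 12: correlate the data with F on 2*tau - 1 points per output.
    x = np.asarray(x, dtype=float)
    N = len(x)
    m = np.arange(-tau + 1, tau)
    F = tent(m, tau)
    return np.array([np.dot(x[(m + j * tau) % N], F) for j in range(N // tau)])

def optimize(y, tau):
    # Inverse eigenfilter: transform, divide by the eigenvalues, transform back.
    n = len(y)
    m = np.arange(-tau + 1, tau)
    alpha = np.sum(tent(m, tau) ** 2)
    beta = np.sum(tent(m, tau) * tent(m + tau, tau))
    lam = alpha + 2.0 * beta * np.cos(2.0 * np.pi * np.arange(n) / n)
    return np.real(np.fft.ifft(np.fft.fft(y) / lam))

def reconstruct(weights, tau):
    # Equation 4: S(t) = sum_k X_k * psi_k(t), with periodic wrap-around.
    n = len(weights)
    N = n * tau
    t = np.arange(N)
    s = np.zeros(N)
    for k in range(n):
        d = (t - k * tau + N // 2) % N - N // 2  # signed periodic distance
        s += weights[k] * tent(d, tau)
    return s

rng = np.random.default_rng(0)
tau = 4
x = rng.normal(size=32)
weights = optimize(compress(x, tau), tau)
residue = x - reconstruct(weights, tau)
print(len(weights), float(np.mean(residue ** 2)))
```

The optimized weights coincide with the least-squares fit of the spline basis to the data, which is why the residue that remains is the smallest one this basis can achieve.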

C. Response Considerations

The inverse eigenfilter in the conjugate domain is
described as follows:

$$H(i, j) = \frac{1}{\lambda(i, j)} , \tag{51}$$

where $\lambda(i, j)$ can be considered as an estimation of the
frequency response of the combined decimation and
interpolation filters. The optimization process H(i,j)
attempts to "undo" what is done in the combined
decimation/interpolation process. Thus, H(i,j) tends to
restore the original signal bandwidth. For example, for $\tau = 2$,
the decimation/interpolation combination is described as
having an impulse response resembling that of the following
3x3 kernel:

$$R = \begin{bmatrix} 0 & \beta & \beta \\ \beta & \alpha & \beta \\ \beta & \beta & 0 \end{bmatrix} \tag{52}$$
Then, its conjugate domain counterpart, $\lambda(i, j)|_{\tau=2}$, will be

$$\lambda(i, j)\big|_{\tau=2} = \alpha + 2\beta\, ( \cdots ) , \tag{53}$$

where i,j are frequency indexes and N represents the number
of frequency terms. Hence, the implementation accomplished
in the image conjugate domain is the conjugate equivalent of
the inverse of the above 3x3 kernel. This relationship will
be utilized more explicitly for the embodiment disclosed in
Section V.

IV. NUMERICAL SIMULATIONS

A. One-Dimensional Case

For a one-dimensional implementation, two types of
signals are demonstrated. A first test is a cosine signal
which is useful for observing the relationship between the
standard error, the size of τ and the signal frequency. The
standard error is defined herein to be the square root of the
average squared error:

$$\sqrt{\frac{1}{N}\sum_{t}\bigl(\Delta X(t)\bigr)^{2}}$$

A second one-dimensional signal is taken from one line of a
grey-scale still image, which is considered to be realistic
data for practical image compression.

process. FIG. 51 shows the error plots as functions of τ for
both images.

An additional aspect of interest is to look at the
optimized weights directly. When these optimal weights are
viewed in picture form, high-quality miniatures 1080, 1082 of
the original image are obtained, as shown in FIG. 52. Hence,
the present embodiment is a very powerful and accurate method
for creating a "thumbnail" reproduction of the original
image.

V. ALTERNATIVE EMBODIMENTS

Video compression is a major component of high-definition
television (HDTV). According to the present
invention, video compression is formulated as an equivalent
three-dimensional approximation problem, and is amenable to
the technique of optimum linear, or more generally
hyperplanar, spline interpolation. The main advantages of
this approach are seen in its fast speed in
coding/reconstruction, its suitability in a VLSI hardware
implementation, and a variable compression ratio. A
principal advantage of the present invention is the
versatility with which it is incorporated into other
compression systems. The invention can serve as a "front-end"
compression platform from which other signal processes
are applied. Moreover, the invention can be applied
iteratively, in multiple dimensions and in either the image
or image conjugate domain. The optimizing method can, for
example, be applied to a compressed image and further applied
to a corresponding compressed residual image. Due to the inherent
low-pass filtering nature of the interpolation process, some
edges and other high-frequency features may not be preserved
in the reconstructed images, but are retained through
the residue. To address this problem, the following
procedures are set forth:

Procedure (a)

resulting equivalent bandwidths are greatly reduced. Since
the subbands have only low frequency components, one can use
the above-described linear or planar spline data
compression technique for coding these data. A 16-band
filter compression system is shown in FIG. 54, and the
corresponding reconstruction system in FIG. 55. There are,
of course, many ways to implement this filter bank, as will
be appreciated by those skilled in the art. For example, a
common method is to exploit the Quadrature Mirror Filter
structure.

V. IMAGE DOMAIN IMPLEMENTATION

The embodiments described earlier utilize a spline
filter optimization process in the image conjugate domain
using an FFT processor or equivalent thereof. The present
invention also provides an equivalent image domain
implementation of a spline filter optimization process which
presents distinct advantages with regard to speed, memory and
process application.

Referring back to Equation 45, it will be appreciated
that the transform processes DFT and $\mathrm{DFT}^{-1}$ may be subsumed
into an equivalent conjugate domain convolution, shown here
briefly:

$$X_j = \mathrm{DFT}\!\left[\frac{1}{\lambda_m}\,\mathrm{DFT}^{-1}(Y_k)\right]. \qquad (54)$$

If $\Omega = \mathrm{DFT}(1/\lambda_m)$, then:

$$X_j = \mathrm{DFT}\!\left[\mathrm{DFT}^{-1}(\Omega)\,\mathrm{DFT}^{-1}(Y_k)\right] = \Omega * Y_k.$$

Furthermore, with $\lambda_m = \mathrm{DFT}(a_j)$, the optimization process
may be completely carried over to an image domain
implementation knowing only the form of the input spline
filter function. The transform processes can be performed in
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g, and h may be set to zero with little noticeable effect in 
the reconstruction. 

The principal advantages of the present preferred
embodiment are in computational saving above and beyond that
of the previously described conjugate domain inverse
eigenfilter process (FIG. 38, 1018). For example, a
two-dimensional FFT process may typically require about $N^2 \log_2 N$
complex operations or equivalently $6N^2 \log_2 N$ multiplications.
The total number of image conjugate filter operations is of
order $10N^2 \log_2 N$. On the other hand, the presently described
(7x7) kernel with 5 distinct operations per image element
will require only $5N^2$ operations, lower by an important
factor of $\log_2 N$. Hence, even for reasonably small images,
there is significant improvement in computation time.
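The operation counts quoted above can be compared directly. The short sketch below simply evaluates the two expressions from the text for a few image sizes; the function names and the chosen sizes are illustrative assumptions.

```python
import math

def conjugate_domain_ops(n):
    """Count quoted in the text for the conjugate domain filter: 10 * N^2 * log2(N)."""
    return 10 * n * n * math.log2(n)

def image_domain_ops(n, ops_per_element=5):
    """Count quoted in the text for the 7x7 kernel: 5 operations per image element."""
    return ops_per_element * n * n

for n in (64, 256, 1024):
    ratio = conjugate_domain_ops(n) / image_domain_ops(n)
    print(f"N={n}: conjugate/image operation ratio = {ratio:.1f}")
```

With the constants as quoted, the ratio works out to $2\log_2 N$, so it grows with image size in the manner described.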

Additionally, there is substantial reduction in buffer
demands because the image domain process 1038 requires only
a 7x7 image block at a given time, in contrast to the
conjugate process which requires a full-frame buffer before
processing. In addition to the lower demands on computation
with the image domain process 1038, there is virtually no
latency in transmission as the process is done in pipeline.
Finally, "power of 2" constraints desirable for efficient FFT
processing are eliminated, allowing convenient application to
a wider range of image dimensions.

The above detailed description is intended to be
exemplary and not limiting. From this detailed description,
taken in conjunction with the appended drawings, the
advantages of the present invention will be readily
understood by one who is skilled in the relevant technology.
The present apparatus and method provide a unique encoder,
compressed file format and decoder which compress images
and decode compressed images. The unique compression system
increases the compression ratios for comparable image quality
while achieving relatively quick encoding and decoding times,
optimizes the encoding process to accommodate different image
types, selectively applies particular encoding methods for a
particular image type, layers the image quality components in
the compressed image, and generates a file format that allows
the addition of other compressed data information.

While the above detailed description has shown,
described and pointed out the fundamental novel features of
the invention as applied to various embodiments, it will be
understood that various omissions and substitutions and
changes in the form and details of the illustrated device may
be made by those skilled in the art, without departing from
the spirit of the invention.
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advance to generate the image domain equivalent of the 
inverse eigenfilter. As shown in FIG. 57, the image domain
spline optimizer Ω operates on compressed image data Y'
generated by a first convolution process 1014 followed by a
decimation process 1016, as previously described. Off-line
or perhaps adaptively, the tensor transformation A (as shown
for example in Equation 25 above) is supplied to an FFT type
processor 1032, which computes the transformation eigenvalues
λ. The tensor of eigenvalues is then inverted at process
block 1034, followed by $\mathrm{FFT}^{-1}$ process block 1036, generating
the image domain tensor Ω. The tensor Ω is supplied to a
second convolution process 1038, whereupon Ω is convolved
with the non-optimized compressed image data Y' to yield
optimized compressed image data Y''.
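The processing chain of FIG. 57 (eigenvalues by FFT, inversion, inverse FFT to obtain Ω, then convolution with the compressed data) can be sketched in one dimension. This is an illustration under stated assumptions rather than the disclosed implementation: the function names are hypothetical, the values α = 1.5 and β = 0.25 correspond to a linear spline with τ = 2, and the placement of the normalization follows NumPy's FFT conventions.

```python
import numpy as np

def omega_kernel(a):
    """Image domain optimizer: invert the eigenvalues of the circulant
    operator with first row `a`, then transform back (blocks 1032-1036)."""
    lam = np.fft.fft(a).real           # real because `a` is symmetric
    return np.fft.ifft(1.0 / lam).real

def circular_convolve(kernel, y, half_width=None):
    """Circular convolution of `kernel` with `y`. If `half_width` is given,
    only the taps within +/- half_width of the centre are used."""
    n = len(y)
    shifts = range(n) if half_width is None else range(-half_width, half_width + 1)
    out = np.zeros(n)
    for s in shifts:
        out += kernel[s % n] * np.roll(y, s)
    return out

n = 64
a = np.zeros(n)
a[0], a[1], a[-1] = 1.5, 0.25, 0.25    # assumed alpha and beta for tau = 2
y = np.random.default_rng(0).random(n) # stand-in for non-optimized data Y'

omega = omega_kernel(a)
exact = np.fft.ifft(np.fft.fft(y) / np.fft.fft(a)).real   # conjugate domain route
full = circular_convolve(omega, y)                         # full image domain kernel
short = circular_convolve(omega, y, half_width=3)          # 7-tap truncation
print(np.max(np.abs(full - exact)), np.max(np.abs(short - exact)))
```

In this example the kernel taps decay quickly away from the centre, which is why a short truncated kernel stays close to the full conjugate domain result.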

In practice, there is a compromise between accuracy and
economy with regard to the specific form of Ω. The optimizer
tensor Ω should be of sufficient size for adequate
approximation of:

$$\mathrm{DFT}\!\left[\frac{1}{\mathrm{DFT}(a)}\right]$$

On the other hand, the term Ω should be small enough to be
computationally tractable for the online convolution process
1038. It has been found that two-dimensional image
compression using the preferred hexagonal tent spline is
adequately optimized by a 5x5 matrix, and preferably a 7x7
matrix, for example, with the following form:

$$\Omega = \begin{bmatrix} 0 & h & -g & g & e & e & g \\ h & f & e & d & c & d & e \\ -g & e & c & b & b & c & e \\ g & d & b & a & b & d & g \\ e & c & b & b & c & e & -g \\ e & d & c & d & e & f & h \\ g & e & e & g & -g & h & 0 \end{bmatrix}$$

Additionally, to reduce computational overhead, the smallest
elements (i.e., the elements near the perimeter) such as f,
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Since the theoretical formulation, derivation, and 
implementation of the disclosed compression method do not 
depend strongly on the choice of the interpolation kernel 
function, other kernel functions can be applied and their 
performances compared. So far, due to its simplicity and 
excellent performance, only the linear spline function has 
been applied. Higher-order splines, such as the quadratic
spline or cubic spline, could also be employed. Aside from the
polynomial spline functions, other more complicated function
forms can be used.

Procedure (b)

Another way to improve the compression method is to
apply certain adaptive techniques. FIG. 53 illustrates such
an adaptive scheme. For a 2-D image 1002, the whole image
can be divided into subimages of smaller size 1084. Since
different subimages have different local features and
statistics, different compression schemes can be applied to
these different subimages. An error criterion is evaluated
in a process step 1086. If the error is below a certain
threshold determined in a process step 1088, a higher
compression ratio is chosen for that subimage. If the error
goes above this threshold, then a lower compression ratio is
chosen in a step 1092 for that subimage. Both multi-kernel
functions 1090 and multi-local-compression ratios provide
good adaptive modification.
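The adaptive scheme of Procedure (b) can be sketched as a per-subimage search over candidate decimation factors. The sketch below is only one possible reading of FIG. 53: the error criterion (row-wise subsampling followed by linear interpolation), the candidate values of τ, the threshold and all function names are assumptions made for illustration, not the disclosed codec.

```python
import numpy as np

def block_error(block, tau):
    """Stand-in error criterion: keep every tau-th column, interpolate
    linearly back to full width, and return the root-mean-square error."""
    t = np.arange(block.shape[1])
    knots = t[::tau]
    rec = np.stack([np.interp(t, knots, row[::tau]) for row in block])
    return np.sqrt(np.mean((block - rec) ** 2))

def choose_tau(block, threshold, candidates=(4, 2, 1)):
    """Try the highest compression first; fall back to lower ratios
    while the error stays above the threshold."""
    for tau in candidates:
        if block_error(block, tau) <= threshold:
            return tau
    return candidates[-1]

def adaptive_plan(image, block_size, threshold):
    """Map each subimage origin (row, col) to its chosen decimation factor."""
    plan = {}
    for r in range(0, image.shape[0], block_size):
        for c in range(0, image.shape[1], block_size):
            sub = image[r:r + block_size, c:c + block_size]
            plan[(r, c)] = choose_tau(sub, threshold)
    return plan

image = np.full((16, 16), 0.5)       # smooth left half
image[:, 8:] = np.arange(8) % 2      # rapidly alternating right half
plan = adaptive_plan(image, block_size=8, threshold=0.05)
print(plan)
```

In this toy image the smooth subimages receive the largest decimation factor while the rapidly varying subimages fall back to no decimation.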

Procedure (c)

Subband coding techniques have been widely used in
digital speech coding. Recently, subband coding has also
been applied to digital image data compression. The basic
approach of subband coding is to split the signal into a set
of frequency bands, and then to compress each subband with an
efficient compression algorithm which matches the statistics
of that band. The subband coding techniques divide the whole
frequency band into smaller frequency subbands. Then, when
these subbands are demodulated into the baseband, the

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FIG. 47 shows the plots of standard error versus 
frequency of the cosine signal for different degrees of 
decimation τ 1056. The general trend is that as the input
signal frequency becomes higher, the standard error
increases. In the low frequency range, smaller values of τ
yield a better performance. One abnormal phenomenon exists
for the τ=2 case and a normalized input frequency of 0.25.
For this particular situation, the linear spline and the
cosine signal at discrete grid points can match perfectly so
that the standard error is substantially equal to 0.
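The special case described above (τ=2, normalized frequency 0.25) can be reproduced with a small one-dimensional experiment. The sketch below is a hedged illustration of the linear spline scheme as described in this specification, assuming periodic boundary conditions, a tent function F(m) = 1 - |m|/τ, and NumPy's FFT conventions; the function names and the signal length of 32 samples are assumptions.

```python
import numpy as np

def tent(m, tau):
    """Linear spline (tent) of half-width tau, equal to 1 at m = 0."""
    return np.maximum(0.0, 1.0 - np.abs(m) / tau)

def compress(x, tau):
    """Correlate the periodic signal with the tent and keep every tau-th result."""
    n = len(x) // tau
    m = np.arange(-(tau - 1), tau)
    idx = (m[None, :] + tau * np.arange(n)[:, None]) % len(x)
    return x[idx] @ tent(m, tau)

def optimize(y, tau):
    """Inverse eigenfilter: inverse FFT, divide by eigenvalues, forward FFT."""
    m = np.arange(-(tau - 1), tau)
    a = np.zeros(len(y))
    a[0] = np.sum(tent(m, tau) ** 2)                            # alpha
    a[1] = a[-1] = np.sum(tent(m + tau, tau) * tent(m, tau))    # beta
    lam = np.fft.fft(a).real
    return np.fft.fft(np.fft.ifft(y) / lam).real

def reconstruct(w, tau):
    """Sum of shifted tents weighted by the optimized weights (periodic)."""
    big_n = len(w) * tau
    t = np.arange(big_n)
    s = np.zeros(big_n)
    for k in range(len(w)):
        d = (t - k * tau + big_n // 2) % big_n - big_n // 2     # signed periodic distance
        s += w[k] * tent(d, tau)
    return s

def roundtrip_error(x, tau):
    """Standard error between the signal and its reconstruction."""
    s = reconstruct(optimize(compress(x, tau), tau), tau)
    return np.sqrt(np.mean((x - s) ** 2))

t = np.arange(32)
err_quarter = roundtrip_error(np.cos(2 * np.pi * 0.25 * t), tau=2)
err_low = roundtrip_error(np.cos(2 * np.pi * t / 16), tau=2)
print(err_quarter, err_low)
```

At a normalized frequency of 0.25 the cosine takes the values 1, 0, -1, 0, so the odd samples lie exactly halfway between their even neighbours and the linear spline reproduces the signal; at the lower frequency a small residual remains.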

Another test example comes from one line of realistic
still image data. FIG. 48a and 48b show the reconstructed
signal waveform 1060 for τ=2 and τ=4, respectively,
superimposed on the original image data 1058. FIG. 48a shows
a good quality of reconstruction for τ=2. For τ=4, in FIG.
48b, some of the high frequency components are lost due to
the combined decimation/interpolation procedure. FIG. 48c
presents the error plot 1062 for this particular test
example. It will be appreciated that the non-linear error
accumulation versus decimation parameter τ may be exploited
to minimize the combination of optimized weights and image
residue.

B. Two-Dimensional Case

For the two-dimensional case, realistic still image data
are used as the test. FIG. 49 and 50 show the original and
reconstructed images for τ=2 and τ=4. For τ=2, the
reconstructed image 1066, 1072 is substantially similar to
the original. However, for τ=4, there are zig-zag patterns
along specific edges in images. This is due to the fact that
the interpolation less accurately tracks the high frequency
components. As described earlier, substantially complete
reconstruction is achieved by retaining the minimized residue
ΔX and adding it back to the approximated image. In the next
section, several methods are proposed for implementing this
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substantially reduce computation overhead associated with 
conjugate transform operations. Typically, such an
improvement is given by the ratio of computation steps
required to transform a set of N elements:

$$\frac{\mathrm{FFT}}{\mathrm{DFT}} = \frac{\tfrac{N}{2}\log_2(N)}{N^2} = \frac{1}{2N}\log_2(N),$$

which improves with the size of the image.
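As a quick worked instance of this ratio (the values of N are chosen here purely for illustration):

$$N = 64:\ \frac{\log_2 64}{2 \cdot 64} = \frac{6}{128} \approx 0.047, \qquad N = 1024:\ \frac{\log_2 1024}{2 \cdot 1024} = \frac{10}{2048} \approx 0.0049.$$

The FFT therefore needs roughly 5% of the DFT's steps at N = 64 and roughly 0.5% at N = 1024.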
A. The Compression Method

The coding method is specified in the following steps:

1. A suitable value of τ (an integer) is chosen. The
compression ratio is $\tau^2$ for two-dimensional images.

2. Equation 23 is applied to find $Y_{j_1 j_2}$, which is the
compressed data to be transmitted or stored:

$$Y_{j_1 j_2} = \sum_{t_1,t_2} X(t_1,t_2)\,\psi_{j_1 j_2}(t_1,t_2) = \sum_{t_1,t_2} X(t_1,t_2)\,F(t_1 - j_1\tau,\; t_2 - j_2\tau).$$

B. The Reconstruction Method

The reconstruction method is shown below in the
following steps:

1. Find the $\mathrm{FFT}^{-1}$ of $Y_{j_1 j_2}$ (the compressed data).

2. The results of step 1 are divided by the
eigenvalues λ(ℓ,m) set forth below. The
eigenvalues λ(ℓ,m) are found by extending Equation
48 to the two-dimensional case to obtain:

$$\lambda(\ell,m) = \alpha + \beta\left(\omega_1^{\ell} + \omega_1^{-\ell} + \omega_2^{m} + \omega_2^{-m} + \omega_1^{\ell}\omega_2^{-m} + \omega_1^{-\ell}\omega_2^{m}\right), \qquad (49)$$
τ    gain
2    a+6b
3    a+6b+12c
4    a+6b+12c+18d

The algorithms for compressing and reconstructing a still
image are explained in the succeeding sections.

III. OVERVIEW OF CODING-RECONSTRUCTION SCHEME

A block diagram of the compression/reconstruction scheme
is shown in FIG. 43. The signal source 1002, which can have
dimension up to N, is first passed through a low-pass filter
(LPF). This low-pass filter is implemented by convolving (in
a process block 1014) a chosen spline filter 1013 with the
input source 1002. For example, the normalized frequency
response 1046 of a one-dimensional linear spline is shown in
FIG. 44. Referring again to FIG. 43, it can be seen that
immediately following the LPF, a subsampling procedure is
used to reduce the signal size 1016 by a factor τ. The
information contained in the subsampled source is not
optimized in the least-mean-square sense. Thus, an
optimization procedure is needed to obtain the best
reconstruction weights. The optimization process can be
divided into three consecutive parts. A DFT 1020 maps the
non-optimized weights into the image conjugate domain.
Thereafter, an inverse eigenfilter process 1022 optimizes the
compressed data. The frequency response plots for some
typical eigenfilters and inverse eigenfilters are shown in
FIG. 45 and 46. After the inverse eigenfilter 1022, a $\mathrm{DFT}^{-1}$
process block 1024 maps its input back to the original image
domain. When the optimized weights are derived,
reconstruction can proceed. The reconstruction can be viewed
as oversampling followed by a reconstruction low-pass filter.

The embodiment of the optimized spline filter described
above may employ DFT and $\mathrm{DFT}^{-1}$ type transform processes.
However, those skilled in the art of digital image processing
will appreciate that it is preferable to employ a Fast
Fourier Transform (FFT) and $\mathrm{FFT}^{-1}$ processes, which
conjugate of the DFT matrix, and Λ is defined to be the
eigenmatrix of A, then:

$$A = Q \Lambda Q^{*}.$$

The solution to y = Ax is then

$$x = A^{-1} y = Q \Lambda^{-1} (Q^{*} y).$$

For the one-dimensional process described above, the
eigenvalues of the transformation operators are:

$$\lambda(\ell) = \sum_{j=0}^{n-1} a_j\,\omega^{j\ell} = \mathrm{DFT}(a_j), \qquad (47)$$

where $a_0 = \alpha$, $a_1 = \beta$, $a_2 = \cdots = a_{n-2} = 0$, $a_{n-1} = \beta$ and $\omega^{n} = 1$.

Hence:

$$\lambda(\ell) = \alpha + \beta\left(\omega^{\ell} + \omega^{-\ell}\right). \qquad (48)$$
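Equations 47 and 48 and the matrix solution above can be checked numerically for a small circulant system. The sketch below is an illustration only: the function names, the size n = 8 and the values α = 1.5, β = 0.25 are assumptions, and the FFT-based solve relies on the first row being symmetric ($a_1 = a_{n-1}$), as it is here.

```python
import numpy as np

def circulant_from_first_row(a):
    """Build the matrix with entries A[j, k] = a[(k - j) mod n]."""
    n = len(a)
    j, k = np.indices((n, n))
    return a[(k - j) % n]

def solve_circulant(a, y):
    """Solve A x = y by dividing by the eigenvalues DFT(a) in the
    transform domain (valid as written for a symmetric first row)."""
    lam = np.fft.fft(a)
    return np.fft.ifft(np.fft.fft(y) / lam).real

n, alpha, beta = 8, 1.5, 0.25
a = np.zeros(n)
a[0], a[1], a[-1] = alpha, beta, beta

A = circulant_from_first_row(a)
lam_formula = alpha + 2 * beta * np.cos(2 * np.pi * np.arange(n) / n)   # Equation 48

x_true = np.random.default_rng(0).random(n)
y = A @ x_true
x_fft = solve_circulant(a, y)
print(np.max(np.abs(x_fft - x_true)))
```

The cosine form used for `lam_formula` is Equation 48 with $\omega^{\ell} + \omega^{-\ell}$ written as $2\cos(2\pi\ell/n)$.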



A direct extension of the 1st-order tensor concept to
the 2nd-order tensor will be apparent to those skilled in the
art. By solving the 2nd-order tensor equations, the results
are extended to compress a 2-D image. FIG. 42 depicts three
possible hexagonal tent functions for 2-dimensional image
compression indices τ=2, 3, 4. The following table exemplifies
the relevant parameters for implementing the hexagonal tent
functions:
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This is proven easily, as shown below: 

$$A^{-1}_{rs} A_{st} = \sum_{\ell=0}^{n-1}\sum_{j=0}^{n-1} \frac{\lambda(j)}{\lambda(\ell)}\,\varphi_r^{(\ell)}\,\varphi_s^{(\ell)*}\,\varphi_s^{(j)}\,\varphi_t^{(j)*} = \sum_{\ell=0}^{n-1} \varphi_r^{(\ell)}\,\varphi_t^{(\ell)*} = \delta_{rt}. \qquad (43)$$

4. Solving 1st-Order Tensor Equations

The solution of a 1st-order tensor equation $Y_r = A_{rs} X_s$ is
given by

$$A^{-1}_{qr} Y_r = A^{-1}_{qr} A_{rs} X_s = \delta_{qs} X_s = X_q, \qquad (44)$$

so that

$$X_q = A^{-1}_{qr} Y_r = \sum_{\ell=0}^{n-1} \frac{1}{\lambda(\ell)}\,\varphi_q^{(\ell)}\,\varphi_r^{(\ell)*}\,Y_r = \mathrm{DFT}\!\left[\frac{\mathrm{DFT}^{-1}(Y_r)}{\lambda(\ell)}\right], \qquad (45)$$

where DFT denotes the discrete Fourier Transform and $\mathrm{DFT}^{-1}$
denotes its inverse discrete Fourier Transform.

An alternative view of the above solution method is
derived below for one dimension using standard matrix
methods. A linear transformation of a 1st-order tensor can
be represented by a matrix. For example, let A denote $A_{rs}$ in
matrix form. If $A_{rs}$ is a circulant transformation, then A is
also a circulant matrix. From matrix theory it is known that
every circulant matrix is "similar" to a DFT matrix. If Q
denotes the DFT matrix of dimension (n x n), and Q* the complex





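$\varphi_s^{(\ell)}$ evidently also satisfies the orthonormal property,
i.e.

$$\varphi_s^{(\ell)}\,\varphi_s^{(j)*} = \delta_{\ell j}, \qquad (39)$$

where $\delta_{\ell j}$ is the Kronecker delta function and * represents
complex conjugation.

A linear transformation is formed by summing the n dyads
$\varphi_r^{(\ell)}\varphi_s^{(\ell)*}$ for ℓ = 0,1,...,n-1 under the summation sign as
follows:

$$\tilde{A}_{rs} = \sum_{\ell=0}^{n-1} \lambda(\ell)\,\varphi_r^{(\ell)}\,\varphi_s^{(\ell)*}. \qquad (40)$$

Then

$$\tilde{A}_{rs}\,\varphi_s^{(j)} = \sum_{\ell=0}^{n-1} \lambda(\ell)\,\varphi_r^{(\ell)}\,\varphi_s^{(\ell)*}\,\varphi_s^{(j)} = \sum_{\ell=0}^{n-1} \lambda(\ell)\,\varphi_r^{(\ell)}\,\delta_{\ell j} = \lambda(j)\,\varphi_r^{(j)}. \qquad (41)$$

Since $\tilde{A}_{rs}$ has by a simple verification the same eigenvectors
and eigenvalues as the transformation $A_{rs}$ has in Equations 35
and 36, the transformations $\tilde{A}_{rs}$ and $A_{rs}$ are equal.

3. Inverse Transformation of 1st-Order Tensors

The inverse transformation of $A_{rs}$ is shown next to be

$$A^{-1}_{rs} = \sum_{\ell=0}^{n-1} \frac{1}{\lambda(\ell)}\,\varphi_r^{(\ell)}\,\varphi_s^{(\ell)*}. \qquad (42)$$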
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The tensor method for solving equations 8 and 20 is 
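illustrated for the 1-dimensional case below:

Letting $A_{rs}$ represent a circulant tensor of the form:

$$A_{rs} = a_{(s-r) \bmod n} \quad \text{for } (r,s = 0,1,2,\ldots,n-1), \qquad (33)$$

and considering the n special 1st-order tensors as

$$W_s^{(\ell)} = (\omega^{\ell})^{s} \quad \text{for } (\ell = 0,1,2,\ldots,n-1), \qquad (34)$$

where ω is the n-th root of unity, then

$$A_{rs} W_s^{(\ell)} = \lambda(\ell)\,W_r^{(\ell)}, \qquad (35)$$

where

$$\lambda(\ell) = \sum_{j=0}^{n-1} a_j\,\omega^{j\ell} \qquad (36)$$

are the distinct eigenvalues of $A_{rs}$. The terms $W_s^{(\ell)}$ are
orthogonal:

$$W_s^{(\ell)}\,W_s^{(j)*} = \begin{cases} 0 & \text{for } \ell \neq j \\ n & \text{for } \ell = j. \end{cases} \qquad (37)$$

At this point it is convenient to normalize these
tensors as follows:

$$\varphi_s^{(\ell)} = \frac{1}{\sqrt{n}}\,W_s^{(\ell)} \quad \text{for } (\ell = 0,1,2,\ldots,n-1). \qquad (38)$$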
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sequence coding, it is advantageous to utilize the tensor 
formalism during the course of the analysis in order to 
readily solve the linear systems in equations 8 and 20. 
Here, the tensor summation convention is used in the analysis
for one and two dimensions. It will be appreciated that such
convention may readily apply to the general case of N
dimensions.

1. Linear Transformation of Tensors

A linear transformation of a 1st-order tensor is written
as

$$Y_r = A_{rs} X_s \quad (\text{sum on } s), \qquad (28)$$

where $A_{rs}$ is a linear transformation, and $Y_r$, $X_s$ are 1st-order
tensors. Similarly, a linear transformation of a second
order tensor is written as:

$$Y_{r_1 r_2} = A_{r_1 r_2 s_1 s_2} X_{s_1 s_2} \quad (\text{sum on } s_1, s_2). \qquad (29)$$

The product or composition of linear transformations is
defined as follows. When the above Equation 29 holds, and

$$Z_{q_1 q_2} = B_{q_1 q_2 r_1 r_2} Y_{r_1 r_2}, \qquad (30)$$

then

$$Z_{q_1 q_2} = B_{q_1 q_2 r_1 r_2} A_{r_1 r_2 s_1 s_2} X_{s_1 s_2}. \qquad (31)$$

Hence,

$$C_{q_1 q_2 s_1 s_2} = B_{q_1 q_2 r_1 r_2} A_{r_1 r_2 s_1 s_2} \qquad (32)$$

is the composition or product of two linear transformations.

2. Circulant Transformation of 1st-Order Tensors
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depicted in FIG. 41b is described by the preferred case in 
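which β=γ=ξ and η=0. It will be appreciated that other
orientations and shapes of the hexagonal tent are possible,
as depicted, for example, in FIG. 41c. Combinations of
hexagonal tents are also possible and embody specific
preferable attributes. For example, a superposition of the
hexagonal tents shown in FIG. 41b and 41c effectively
"symmetrizes" the compression process.

From Equation 25 above, $A_{j_1 j_2 k_1 k_2}$ can be expressed in
circulant form by the following expression:

$$A_{j_1 j_2 k_1 k_2} = a_{(k_1 - j_1)_{n_1},\,(k_2 - j_2)_{n_2}}, \qquad (26)$$

where $(k_\ell - j_\ell)_{n_\ell}$ denotes $(k_\ell - j_\ell) \bmod n_\ell$, ℓ=1,2, and

$$[a_{s_1 s_2}] = \begin{bmatrix} a_{00} & a_{01} & a_{02} & \cdots & a_{0,n_2-1} \\ a_{10} & a_{11} & a_{12} & \cdots & a_{1,n_2-1} \\ a_{20} & a_{21} & a_{22} & \cdots & a_{2,n_2-1} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n_1-1,0} & a_{n_1-1,1} & a_{n_1-1,2} & \cdots & a_{n_1-1,n_2-1} \end{bmatrix} = \begin{bmatrix} \alpha & \gamma & 0 & \cdots & 0 & \gamma \\ \beta & \xi & 0 & \cdots & 0 & \eta \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ \beta & \eta & 0 & \cdots & 0 & \xi \end{bmatrix} \qquad (27)$$

where $(s_1 = 0,1,2,\ldots,n_1-1)$ and $(s_2 = 0,1,2,\ldots,n_2-1)$. Note
that when $[a_{s_1 s_2}]$ is represented in matrix form, it is "block
circulant."

C. Compression-Reconstruction Algorithms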

Because the objective is to apply the above-disclosed 
LMS error linear spline interpolation techniques to image 
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in FIG. 41a. The tensor transform in Equation 21 is treated 
in a similar fashion to obtain

$$A_{j_1 j_2 k_1 k_2} = \sum_{m_1,m_2} F\bigl(m_1 + (j_1 - k_1)\tau,\; m_2 + (j_2 - k_2)\tau\bigr)\,F(m_1,m_2)$$

$$= \begin{cases} \sum \bigl(F(m_1,m_2)\bigr)^2 \triangleq \alpha & \text{if } (j_1-k_1) \equiv 0 \bmod n_1 \,\wedge\, (j_2-k_2) \equiv 0 \bmod n_2 \\ \sum F(m_1 \pm \tau, m_2)\,F(m_1,m_2) \triangleq \beta & \text{if } (j_1-k_1) \equiv \pm 1 \bmod n_1 \,\wedge\, (j_2-k_2) \equiv 0 \bmod n_2 \\ \sum F(m_1, m_2 \pm \tau)\,F(m_1,m_2) \triangleq \gamma & \text{if } (j_1-k_1) \equiv 0 \bmod n_1 \,\wedge\, (j_2-k_2) \equiv \pm 1 \bmod n_2 \\ \sum F(m_1 \pm \tau, m_2 \pm \tau)\,F(m_1,m_2) \triangleq \xi & \text{if } (j_1-k_1) \equiv \pm 1 \bmod n_1 \,\wedge\, (j_2-k_2) \equiv \pm 1 \bmod n_2 \\ \sum F(m_1 \mp \tau, m_2 \pm \tau)\,F(m_1,m_2) \triangleq \eta & \text{if } (j_1-k_1) \equiv \mp 1 \bmod n_1 \,\wedge\, (j_2-k_2) \equiv \pm 1 \bmod n_2. \end{cases} \qquad (25)$$

The values of α, β, γ, and ξ depend on τ, and the shape
and orientation of the hexagonal tent with respect to the
image domain, where for example $m_1$ and $m_2$ represent row and
column indices. For greater flexibility in tailoring the
hexagonal tent function, it is possible to utilize all
parameters of the $[A_{j_1 j_2 k_1 k_2}]$. However, to minimize
calculational overhead it is preferable to employ symmetric
hexagons, disposed over the image domain with a
bi-directional period τ. Under these conditions, β=γ=ξ and η=0,
simplifying $[A_{j_1 j_2 k_1 k_2}]$ considerably. Specifically, the
hexagonal tent depicted in FIG. 41a and having an orientation
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and 

$$Y_{j_1 j_2} = \sum_{t_1,t_2} X(t_1,t_2)\,\psi_{j_1 j_2}(t_1,t_2). \qquad (22)$$

With the visual aid of FIG. 41a, the tensor $Y_{j_1 j_2}$ reduces
as follows:

$$Y_{j_1 j_2} = \sum_{t_1,t_2} X(t_1,t_2)\,\psi_{j_1 j_2}(t_1,t_2) = \sum_{t_1,t_2} X(t_1,t_2)\,F(t_1 - j_1\tau,\; t_2 - j_2\tau). \qquad (23)$$

Letting $t_k - j_k\tau = m_k$ for k = 1,2, then

$$Y_{j_1 j_2} = \sum_{m_1,m_2} X(m_1 + j_1\tau,\; m_2 + j_2\tau)\,F(m_1,m_2) \qquad (24)$$

for $(j_1 = 0,1,\ldots,n_1-1)$ and $(j_2 = 0,1,\ldots,n_2-1)$, where $F(m_1,m_2)$
is the doubly periodic, six-sided pyramidal function, shown
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A condition for L to be a minimum is 



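$$\frac{\partial L}{\partial X_{j_1 j_2}} = 2\left[\sum_{k_1,k_2} X_{k_1 k_2} \sum_{t_1,t_2} \psi_{j_1 j_2}(t_1,t_2)\,\psi_{k_1 k_2}(t_1,t_2) - \sum_{t_1,t_2} X(t_1,t_2)\,\psi_{j_1 j_2}(t_1,t_2)\right] = 0. \qquad (19)$$

The best coefficients $X_{k_1 k_2}$ are the solution of the 2nd-order
tensor equation,

$$A_{j_1 j_2 k_1 k_2} X_{k_1 k_2} = Y_{j_1 j_2}, \qquad (20)$$

where the summation is on $k_1$ and $k_2$,

$$A_{j_1 j_2 k_1 k_2} = \sum_{t_1,t_2} \psi_{j_1 j_2}(t_1,t_2)\,\psi_{k_1 k_2}(t_1,t_2), \qquad (21)$$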
resultant planar interpolation. In FIG. 40, $X(t_1,t_2)$ is a
doubly periodic array of image data (e.g., still image) of
periods $n_1\tau$ and $n_2\tau$, with respect to the integer variables $t_1$
and $t_2$ where τ is a multiple of both $t_1$ and $t_2$. The actual
image 1002 to be compressed can be viewed as being repeated
periodically throughout the plane as shown in FIG. 40.
Each subimage of the extended picture is separated by a
border 1032 (or gutter) of zero intensity of width τ. This
border is one of several possible preferred "boundary
conditions" to achieve a doubly-periodic image.

Consider now a doubly periodic planar spline, $F(t_1,t_2)$,
which has the form of a six-sided pyramid or tent, centered
at the origin and is repeated periodically with periods $n_1\tau$
and $n_2\tau$ with respect to integer variables $t_1$ and $t_2$,
respectively. A perspective view of such a planar spline
function 1034 is shown in FIG. 41a and may hereinafter be
referred to as "hexagonal tent." Following the
one-dimensional case by analogy, letting:

$$\psi_{k_1 k_2}(t_1,t_2) = F(t_1 - k_1\tau,\; t_2 - k_2\tau) \qquad (17)$$

for $(k_1 = 0,1,\ldots,n_1-1)$ and $(k_2 = 0,1,\ldots,n_2-1)$, the "best"
weights $X_{k_1 k_2}$ are found such that:

$$L(X_{k_1 k_2}) = \sum_{t_1,t_2}\left(X(t_1,t_2) - \sum_{k_1,k_2} X_{k_1 k_2}\,\psi_{k_1 k_2}(t_1,t_2)\right)^{2} \qquad (18)$$

is a minimum.
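Therefore, $A_{jk}$ in Equations 14 and 15 has explicitly the
following equivalent circulant matrix representations:

$$[A_{jk}] = \begin{bmatrix} a_0 & a_1 & a_2 & \cdots & a_{n-1} \\ a_{n-1} & a_0 & a_1 & \cdots & a_{n-2} \\ a_{n-2} & a_{n-1} & a_0 & \cdots & a_{n-3} \\ \vdots & \vdots & \vdots & & \vdots \\ a_1 & a_2 & a_3 & \cdots & a_0 \end{bmatrix} = \begin{bmatrix} \alpha & \beta & 0 & \cdots & 0 & \beta \\ \beta & \alpha & \beta & \cdots & 0 & 0 \\ 0 & \beta & \alpha & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ \beta & 0 & 0 & \cdots & \beta & \alpha \end{bmatrix} \qquad (16)$$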
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15 



One skilled in the art of matrix and filter analysis
will appreciate that the periodic boundary conditions imposed
on the data lie outside the window of observation and may be
defined in a variety of ways. Nevertheless, periodic
boundary conditions serve to simplify the process
implementation by ensuring that the correlation matrix [A_jk]
has a calculable inverse. Thus, the optimization process
involves an inversion of [A_jk], in which the periodic boundary
conditions and consequent circulant character play a
preferred role. It is also recognized that for certain
spline functions, symmetry rendered in the correlation matrix
allows inversion in the absence of periodic image boundary
conditions.

B. Two-Dimensional Compression by Planar Splines

For two-dimensional image data, multi-planar spline 
functions are combined to approximate the image data with a 
The Y_j's in Equation 12 represent the compressed data to
be transmitted or stored. Note that this encoding scheme
involves n correlation operations on only 2τ-1 points.
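
As an illustration of the correlation in Equation 12, the following Python sketch computes the compressed samples Y_j from one period of data. It is a minimal sketch under stated assumptions: the function names `hat` and `compress` are hypothetical, and the spline is assumed to be the hat function F(m) = 1 - |m|/τ for |m| < τ (zero elsewhere), which is consistent with the two-point interpolation sum in the one-dimensional derivation.

```python
def hat(m, tau):
    """Assumed linear spline: 1 - |m|/tau for |m| < tau, else 0."""
    return max(0.0, 1.0 - abs(m) / tau)


def compress(x, tau):
    """Return Y_j = sum_m X(m + j*tau) * F(m) for periodic data x.

    len(x) must equal n * tau; indices wrap modulo n * tau.
    Each Y_j touches only the 2*tau - 1 samples where F is nonzero.
    """
    period = len(x)
    if period % tau != 0:
        raise ValueError("len(x) must be a multiple of tau")
    n = period // tau
    y = []
    for j in range(n):
        acc = 0.0
        for m in range(-(tau - 1), tau):
            acc += x[(m + j * tau) % period] * hat(m, tau)
        y.append(acc)
    return y
```

For constant data of value 1 and τ = 2, each Y_j is the sum of the three nonzero spline samples (0.5 + 1 + 0.5).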

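Since F(t) is assumed to be periodic with period nτ,
the matrix form of A_jk in Equation 9 can be reduced by
substituting Equation 3 into Equation 9 to obtain:

\[
A_{jk} = \sum_{m} F(m + (j-k)\tau)\, F(m) =
\begin{cases}
\sum_{m=-(\tau-1)}^{\tau-1} (F(m))^2 \triangleq \alpha & \text{if } j-k \equiv 0 \bmod n \\
\sum_{m} F(m \pm \tau)\, F(m) \triangleq \beta & \text{if } j-k \equiv \pm 1 \bmod n \\
0 & \text{otherwise}
\end{cases}
\tag{13}
\]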

By Equation 13, A_jk can also be expressed in circulant form
in the following manner:

\[
A_{jk} = a_{(k-j)_n}
\tag{14}
\]

where (k-j)_n denotes (k-j) mod n, and

\[
a_0 = \alpha,\quad a_1 = \beta,\quad a_2 = 0,\quad \ldots,\quad a_{n-1} = \beta.
\tag{15}
\]
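This leads to the system,

\[
\sum_{k=0}^{n-1} A_{jk} X_k = Y_j
\]

for (j = 0, 1, ..., n-1), where

\[
A_{jk} = \sum_{t} \Psi_j(t)\, \Psi_k(t)
\tag{9}
\]

and

\[
Y_j = \sum_{t} X(t)\, \Psi_j(t).
\tag{10}
\]

The term Y_j in Equation 10 is reducible as follows:

\[
Y_j = \sum_{t} X(t)\, F(t - j\tau).
\]

Letting (t - jτ) = m, then:

\[
Y_j = \sum_{m} X(m + j\tau)\, F(m)
\tag{12}
\]

for (j = 0, 1, 2, ..., n-1).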
(2)

as shown by the functions Ψ_k(t) 1014 of FIG. 39.

The family of shifted linear splines F(t) is defined as
follows:

\[
\Psi_k(t) = F(t - k\tau) \quad \text{for } (k = 0, 1, 2, \ldots, (n-1)).
\tag{3}
\]

One object of the present embodiment is to approximate X(t)
by the n-point sum:

\[
S(t) = \sum_{k=0}^{n-1} X_k \Psi_k(t),
\tag{4}
\]

in a least-mean-squares fashion where X_0, ..., X_{n-1} are n
reconstruction weights. Observe that the two-point sum in
the interval 0 < t < τ is:

\[
X_0 \Psi_0(t) + X_1 \Psi_1(t)
= X_0\left(1 - \frac{t}{\tau}\right) + X_1\left(1 - \frac{|t - \tau|}{\tau}\right)
= X_0 + (X_1 - X_0)\frac{t}{\tau}.
\]

Hence, S(t) 1030 in Equation 4 represents a linear
interpolation of the original waveform X(t) 1002, as shown in
FIG. 39.
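
The interpolation property can be sketched in Python. This is an illustrative sketch only: the function name `reconstruct` is hypothetical, and it assumes the hat-shaped spline implied by the two-point sum, so that S(t) between two adjacent knots is X_k + (X_{k+1} - X_k)(t - kτ)/τ, with the last segment wrapping around to X_0 because the data are periodic.

```python
def reconstruct(weights, tau):
    """Evaluate S(t) = sum_k X_k * Psi_k(t) on one period of length n*tau.

    Between knots k*tau and (k+1)*tau only two shifted splines overlap,
    so the sum reduces to linear interpolation of adjacent weights.
    """
    n = len(weights)
    s = []
    for t in range(n * tau):
        k = t // tau                      # index of the knot at or left of t
        frac = (t - k * tau) / tau        # position within the segment, 0 <= frac < 1
        left = weights[k]
        right = weights[(k + 1) % n]      # periodic wrap-around
        s.append(left + (right - left) * frac)
    return s
```

With weights [0, 4] and τ = 2, the reconstructed period is 0, 2, 4, 2: the knots reproduce the weights and the midpoints are their averages.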

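To find the "best" weights X_0, ..., X_{n-1}, the quantity
L(X_0, X_1, ..., X_{n-1}) is minimized:

\[
L(X_0, X_1, \ldots, X_{n-1}) = \sum_{t} \left( X(t) - \sum_{k=0}^{n-1} X_k \Psi_k(t) \right)^2,
\]

where the sum has been taken over one period plus τ of the
data. L is minimized by differentiating with respect to X_k
as follows:
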
transformation (DFT^-1) 1024. The advantages of this
embodiment, in part, rely on the fast coding/reconstruction
speed, since only DFT and DFT^-1 are the primary computations,
where now the optimization is a simple division. Greater
elaboration of the principles of the method is provided in
Section II, where the presently contemplated preferred
embodiments are also derived as closed-form solutions for a
one-dimensional linear spline basis and two-dimensional planar
spline bases. Section III provides an operational
description of the preferred method of compression and
reconstruction utilizing the optimal procedure disclosed in
Section II. Section IV discloses results of a reduction to
practice of the preferred embodiments applied to one- and
two-dimensional images. Finally, Section V discloses a
preferred method of the filter optimizing process implemented
in the image domain.
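
The three-step optimization (DFT, inverse eigenfiltering, inverse DFT) can be sketched for the one-dimensional circulant matrix with a_0 = α and a_1 = a_{n-1} = β. The sketch below is illustrative and rests on assumptions: the function names are hypothetical, a naive O(n^2) DFT stands in for an FFT, n is assumed to be at least 3, and the eigenvalues are taken as λ_m = α + 2β cos(2πm/n), which is the standard result for a symmetric circulant matrix of this form.

```python
import cmath
import math


def dft(seq, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for m in range(n):
        acc = 0j
        for k in range(n):
            acc += seq[k] * cmath.exp(sign * 2j * math.pi * m * k / n)
        out.append(acc / n if inverse else acc)
    return out


def optimize_weights(y, alpha, beta):
    """Solve the circulant system A X = Y, a_0 = alpha, a_1 = a_{n-1} = beta.

    Steps: (1) DFT of Y, (2) divide by the eigenvalues of A, (3) inverse DFT.
    """
    n = len(y)
    if n < 3:
        raise ValueError("this sketch assumes n >= 3")
    eigenvalues = [alpha + 2.0 * beta * math.cos(2.0 * math.pi * m / n)
                   for m in range(n)]
    if any(abs(lam) < 1e-12 for lam in eigenvalues):
        raise ValueError("A is singular for these parameters")
    y_hat = dft(y)
    x_hat = [y_hat[m] / eigenvalues[m] for m in range(n)]
    return [v.real for v in dft(x_hat, inverse=True)]
```

As a check, with α = 4, β = 1 and weights X = (1, 2, 3, 4), multiplying by the circulant matrix gives Y = (10, 12, 18, 20); passing that Y to `optimize_weights` recovers X to within rounding error.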

II. IMAGE DATA COMPRESSION BY OPTIMAL SPLINE INTERPOLATION

A. One-Dimensional Compression by Bi-Linear Splines

For one-dimensional image data, bi-linear spline
functions are combined to approximate the image data with a
resultant linear interpolation, as shown in FIG. 39. The
resultant closed-form approximating and optimizing process
has a significant advantage in computational simplicity.

Letting the decimation index τ and image sampling period t
be fixed, positive integers (τ, t = 1, 2, ...), and letting X(t)
be a periodic sequence of data of period nτ, where n is also
an integer, consider a periodic linear spline of period
nτ of the type,

\[
F(t) = F(t + n\tau),
\tag{1}
\]

where



where A_ij = Ψ_i * Ψ_j is a transformation matrix having elements
representing the correlation between basis vectors Ψ_i and Ψ_j.
The optimal weights X_k are determined by the inverse
operation A^-1:

\[
X_k = A^{-1}\left( X * \Psi_k(x) \right),
\]



rendering compression with the least residue. One skilled in
the art of LMS criteria will know how to express the
processes given here in the geometry of multiple dimensions.
Hence, the processes described herein are applicable to a
variety of image data types.

The present brief and general description has direct
processing counterparts depicted in FIG. 36. The operation

\[
X * \Psi_k(x)
\]

represents a convolution filtering process 1014, and

\[
A^{-1}\left( X * \Psi_k(x) \right)
\]

represents the optimizing process 1018.

In addition, as will be demonstrated in the following
sections, the inverse operation A^-1 is equivalent to a
so-called inverse eigenfilter when taken over to the conjugate
image domain. Specifically,

\[
\mathrm{DFT}(X_k) = \frac{1}{\lambda_m}\, \mathrm{DFT}\left( X * \Psi_k(x) \right),
\]

where DFT is the familiar discrete Fourier transform (DFT)
and λ_m are the eigenvalues of A. The equivalent optimization
block 1018, shown in FIG. 38, comprises three steps: (1) a
discrete Fourier transformation (DFT) 1020; (2) inverse
eigenfiltering 1022; and (3) an inverse discrete Fourier


renders a small residual image. This residual image or
residue ΔX is preferably retained for subsequent processing
or reconstruction. In this respect there is a compromise
between computational complexity, compression, and the
magnitude of the residue.

In a schematic view, a set of spline basis functions
S' = {Ψ_k} may be regarded as a subset of vectors in the domain
of possible image vectors S = {X}, as depicted in FIG. 37. The
decomposition or projection of X onto components of S' is not
unique and may be accomplished in a number of ways. A
preferable criterion set forth in the present description is
a least-mean-square (LMS) error, which minimizes the overall
difference between the source image X and the reconstructed
image X'. Geometrically, the residual image ΔX can be
thought of as a minimal vector in the sense that it is the
shortest possible vector connecting X to X'. That is, ΔX
might, for instance, be orthogonal to the subspace S', as
shown in FIG. 37. As will be elaborated in the next
section, the projection of image vector X onto S' is
approximated by an expression of the form:

\[
X \approx X' = \sum_{k} X_k \Psi_k(x).
\]

The "best" X' is determined by the constraint that ΔX = X - X' is
minimized with respect to variations in the weights X_j:

\[
\frac{\partial}{\partial X_j} \left| X - \sum_{k} X_k \Psi_k(x) \right|^2 = 0,
\]

which, by analogy to FIG. 37, describes an orthogonal
projection of X onto S'.

Generally, the above system of equations which
determines the optimal X_k may be regarded as a linear
transformation, which maps X onto S' optimally, represented
here by:
set of basis functions that constitute shifted, but
overlapping, spline functions {Ψ_k(x)} such that

\[
X \approx X' = \sum_{k} X_k \Psi_k(x),
\]

where X' is the reconstructed image vector and X_k is the
decomposition weight. The image data vector X is thus
approximated by an array of preferably computationally
simple, continuous functions, such as lines or planes,
allowing also an efficient reconstruction of the original
image.

According to the method, the basis functions need not be
orthogonal and are preferably chosen to overlap in order to
provide a continuous approximation to image data, thereby
rendering a non-diagonal basis correlation matrix:

This property is exploited by the method of the present
invention, since it allows the user to "adapt" the response
of the filter by the nature and degree of cross-correlation.
Furthermore, the basis of spline functions need not be
complete in the sense of spanning the space of all image
data, but preferably generates a close approximation to image
X. It is known that the decomposition of image vector X into
components of differing spline basis functions is not
unique. The method herein disclosed optimizes the projection
by adjusting the weights X_k such that the differential
variation of the average residue vanishes, δ<ΔX²> = 0, or
equivalently <ΔX²> = min. In general, it will be expected that
a more complete basis set will provide a smaller residue and
better compression, which, however, requires greater
computational overhead and greater compression. Accordingly,
it is preferable to utilize a computationally simple basis
set, which is easy to manipulate in closed form and which
produces compressed data Y', which is substantially free of
aliasing prior to subsequent process steps. While the
convolution or decimation filter 1014 attenuates aliasing
effects, it does so by reducing the number of bits required
to represent the signal. It is "low-pass" in nature,
reducing the information content of the reconstructed image
X'. Consequently, the residue ΔX 1012 will be larger, and in
part, will offset the compression attained through
decimation.

The present invention disclosed herein solves this
problem by providing a method of optimizing the compressed
data such that the mean-square-residue <ΔX²> is minimized,
where "< >" shall herein denote an averaging process. As
shown in FIG. 36, compressed data Y', generated in a manner
similar to that shown in FIG. 35, is further processed by an
optimization process 1018. Accordingly, the optimization
process 1018 is dependent upon the properties of convolution
filter 1014 and is constrained such that the variation of the
mean-square-residue is zero, δ<ΔX²> = 0. The disclosed method
of filter optimization "matches" the filter response to the
image data, thereby minimizing the residue. Since the
decimation filter 1014 is low-pass in nature, the
optimization process 1018, in part, compensates by
effectively acting as a "self-tuned" high-pass filter. A
brief descriptive overview of the optimization procedure is
provided in the following sections.

A. Image Approximation by Spline Functions

As will become clear in the following detailed
description, the input decimation filter 1014 of FIG. 36 may
be regarded as a projection of an image data vector X onto a

elements than the signal energy of X, with some provisions
regarding error. As depicted in FIG. 34, digital source
image data 1002 represented by an appropriate N-dimensional
array X is supplied to compression block 1004, whereupon
image data X is transformed to compressed data Y' via a first
generalized process represented here as G(X)=Y'. Compressed
data may be stored or transmitted (process block 1006) to a
"remote" reconstruction block 1008, whereupon a second
generalized process, G'(Y')=X', operates to transform
compressed data Y' into a reconstructed image X'.

G and G' are not necessarily processes of mutual
inversion, and the processes may not conserve the full
information content of image data X. Consequently, X' will,
in general, differ from X, and information is lost through
the coding/reconstruction process. The residual image or
so-called residue is generated by supplying compressed data Y'
to a "local" reconstruction process 1005 followed by a
difference process 1010 which computes the residue ΔX=X-X'
1012. Preferably, X and X' are sufficiently close, so that
the residue ΔX 1012 is small and may be transmitted, stored
along with the compressed data Y', or discarded. Subsequent
to the remote reconstruction process 1008, the residue ΔX
1012 and reconstructed image X' are supplied to adding
process 1007 to generate a restored image X'+ΔX=X" 1003.
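
The data flow of FIG. 34 can be sketched in Python. This is a minimal sketch, not the disclosed filter: the stand-ins for G and G' (block averaging and sample repetition) and all function names are hypothetical and chosen only to make the residue path concrete. The input length is assumed to be a multiple of τ.

```python
def compress_g(x, tau):
    """Hypothetical G: average each run of tau samples (low-pass + decimate)."""
    return [sum(x[i:i + tau]) / tau for i in range(0, len(x), tau)]


def reconstruct_g(y, tau):
    """Hypothetical G': repeat each compressed sample tau times."""
    return [v for v in y for _ in range(tau)]


def residue(x, x_rec):
    """Difference process 1010: delta_x = X - X'."""
    return [a - b for a, b in zip(x, x_rec)]


def restore(x_rec, delta_x):
    """Adding process 1007: X'' = X' + delta_x."""
    return [a + b for a, b in zip(x_rec, delta_x)]
```

For X = (1, 3, 5, 9) and τ = 2, the compressed data are (2, 7), the reconstruction is (2, 2, 7, 7), and the residue is (-1, 1, -2, 2); adding the residue back restores X exactly, which shows why G and G' need not be mutual inverses when the residue is retained.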

In practice, to reduce computational overhead associated
with large images during compression, a decimating or
subsampling process may be performed to reduce the number of
samples. Decimation is commonly characterized by a reduction
factor τ (tau), which indicates a measure of image data
elements to compressed data elements. However, one skilled
in the art will appreciate that image data X must be filtered
in conjunction with decimation to avoid aliasing. As shown
in FIG. 35, a low-pass input filter may take the form of a
pointwise convolution of image data X with a suitable
convolution filter 1014, preferably implemented using a
matrix filter kernel. A decimation process 1016 then
FIGs. 34-57 illustrate a preferred embodiment of the
Reed Spline Filter 138 which is advantageously used for the
first, second and third Reed Spline Filters 148, 225, and
227. The Reed Spline Filter described in FIGs. 34-57 is in
terms of a generic image format. In particular, the image
input data comprises Y image input which corresponds, for
example, to the red, green and blue image data in the first
Reed Spline Filter 148 in the foregoing discussion. In like
manner, the outputs of the Reed Spline Filter 138 described as
reconstruction values should be understood to correspond, for
example, to the R_tau2 miniature 180, the G_tau2 miniature
182 and the B_tau2 miniature 184 of the first Reed Spline
Filter 148.

The Reed Spline Filter is based on a least-mean-square
(LMS) error spline approach, which is extendable
to N dimensions. One- and two-dimensional image data
compression utilizing linear and planar splines,
respectively, are shown to have compact, closed-form optimal
solutions for convenient, effective compression. The
computational efficiency of this new method is of special
interest, because the compression/reconstruction algorithms
proposed herein involve only the Fast Fourier Transform (FFT)
and inverse FFT types of processors or other high-speed
direct convolution algorithms. Thus, the compression and
reconstruction from the compressed image can be extremely
fast and realized in existing hardware and software. Even
with this high computational efficiency, good image quality
is obtained upon reconstruction. An important and practical
consequence of the disclosed method is the convenience and
versatility with which it is integrated into a variety of
hybrid digital data compression systems.

I. SPLINE FILTER OVERVIEW

The basic process of digital image coding entails
transforming a source image X into a "compressed" image Y
such that the signal energy of Y is concentrated into fewer
of a DCT coefficient will reduce the VMSE. Since the bits
added by changing the step value q are usually transformed into
VMSE reduction, the MRT is generally negative.

The MRT is calculated for all of the DCT coefficients.
The adaptive method utilized by the optimized DCT 136 adjusts
the quantization step values q of the quantization table Q 202
by reducing the quantization step value q corresponding to
the maximum MRT and increasing the quantization step value q
corresponding to the minimum MRT. The optimized DCT 136
repeats the process until the MRT is equalized across all of
the DCT coefficients while holding the VMSE constant.

FIG. 32 shows a flow chart of the process of creating an
adaptive uniform DCT quantization table. In step 700 the
optimized DCT 136 computes the MRT values for all DCT
coefficients i. In step 702 the optimized DCT 136 compares
the MRT values. If the MRT values are the same, the optimized
DCT 136 uses the resulting quantization table Q 202. If the
MRT values are not equal, the optimized DCT 136 finds the
minimum MRT value and the maximum MRT value for the DCT
coefficients i in step 706.

In step 708, the optimized DCT 136 increases the
quantization step value q_low corresponding to the minimum MRT
value and decreases the quantization step value q_high
associated with the maximum MRT value. Increasing q_low
reduces the number of bits devoted to the corresponding DCT
coefficient but does not increase the VMSE appreciably.
Reducing the quantization step value q_high increases the
number of bits devoted to the corresponding DCT coefficient
and reduces the VMSE significantly. The optimized DCT 136
offsets the adjustments for the quantization step values q_low
and q_high in order to keep the VMSE constant.

The optimized DCT 136 returns to step 700, where the
process is repeated until all MRT values are equal. Once all
of the quantization step values q are determined, the
resulting quantization table Q 202 is complete.

The Reed Spline Filter
This provides a non-linear mapping of quantized
coefficients onto reconstructed values as follows:

r_i = CS_i(q_i) for i = 1, 2, ..., 63
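
Because the CS reconstruction values are described as the conditional expected value of each quantized level, one way to sketch their computation in Python is as a histogram-weighted mean within each quantization interval. This sketch is illustrative only: the function name is hypothetical, and the interval boundaries (magnitudes k that round half-up to the same level under a uniform step) are an assumption, since the interval definition is not given in this passage.

```python
def centroid_reconstruction(hist, step):
    """Return {level: centroid} for one AC coefficient.

    hist[k] is the number of blocks in which abs(coefficient) == k.
    Level q is assumed to cover the magnitudes k with int(k / step + 0.5) == q.
    The centroid is the histogram-weighted mean magnitude within that level.
    """
    totals = {}
    weighted = {}
    for k, count in enumerate(hist):
        level = int(k / step + 0.5)
        totals[level] = totals.get(level, 0) + count
        weighted[level] = weighted.get(level, 0) + k * count
    # Levels that never occur have no centroid and are omitted.
    return {level: weighted[level] / totals[level]
            for level in totals if totals[level] > 0}
```

With a step of 4 and a histogram concentrated at small magnitudes, the centroid of level 1 falls below the linear reconstruction value of 4, which is the effect the non-linear mapping is meant to capture.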

In the adaptive uniform DCT quantization mode, the image
classifier 152 outputs the control script 196 that
directs the optimized DCT 136 to adjust a given DCT uniform
quantization table Q 202 to provide more efficient
compression while holding the visual quality constant. This
method adjusts the DCT quantization step sizes such that the
compressed bit rate (entropy) after quantizing the DCT
coefficients is minimized subject to the constraint that the
visually-weighted mean squared error arising from the DCT
quantization is held constant with respect to the base
quantization table and the user-supplied quantization
parameter L.

The optimized DCT 136 uses marginal analysis to adjust
the DCT quantization step sizes. A "marginal rate of
transformation (MRT)" is computed for each DCT coefficient.
The MRT represents the rate at which bits are "transformed"
into (a reduction of) the visually weighted mean squared
error (VMSE). The MRT of a coefficient is defined as the
ratio of 1) the marginal change in the encoded bit rate with
respect to a quantization step value q to 2) the marginal
change in the visual mean square error with respect to the
quantization step value q.

The MRT (bits/VMSE) ratio is calculated as follows:

MRT (bits/VMSE) = (Δbits/Δq) / (ΔVMSE/Δq)
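
The ratio can be illustrated with a finite-difference sketch in Python. Everything other than the ratio itself is an assumption made for illustration: the function names are hypothetical, and the two per-coefficient models (`model_bits`, `model_vmse`) are simple stand-ins, not the rate and error measures used by the optimized DCT 136.

```python
import math


def marginal_rate(bits_fn, vmse_fn, q, dq=1e-3):
    """Finite-difference MRT = (dbits/dq) / (dVMSE/dq) at step value q."""
    d_bits = (bits_fn(q + dq) - bits_fn(q)) / dq
    d_vmse = (vmse_fn(q + dq) - vmse_fn(q)) / dq
    if d_vmse == 0:
        raise ZeroDivisionError("VMSE does not change with q at this point")
    return d_bits / d_vmse


# Hypothetical per-coefficient models, used only to exercise the formula.
def model_bits(q, sigma=16.0):
    return math.log2(1.0 + sigma / q)      # fewer bits as q grows


def model_vmse(q, weight=1.0):
    return weight * q * q / 12.0           # uniform-quantizer noise model
```

Under these stand-in models the bit rate falls and the error rises as q grows, so `marginal_rate(model_bits, model_vmse, 4.0)` is negative, in line with the statement that the MRT is generally negative.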

Decreasing the quantization step value q will add more
bits to the representation of the corresponding DCT
coefficient. However, adding more bits to the representation
Reconstruction is also linear unless reconstruction
values have been computed and stored in the CS data segment
204. Letting r denote the dequantized DCT coefficients, the
linear dequantization formula is:

for i = 0, 1, ..., 63

In the fixed-table DCT mode, the optimized DCT 136 can
also compute the optimal reconstruction values stored in the
CS data segment 204. While the DC term 201 is always
calculated linearly, the CS reconstruction values represent
the conditional expected value of each quantized level of
each AC term 200. The CS reconstruction values are
calculated for each AC term 200 by first calculating an
absolute value frequency histogram, H_i, for the ith
coefficient (for i = 1, 2, ..., 63) over all DCT blocks in
the source image, N, as follows:

for j = 0, 1, ..., N
H_i(k) = frequency(abs(x_ij) = k)

where x_ij = the value of the ith coefficient in the
jth DCT block.

Second, the centroid of coefficient values is calculated
between each quantization step. The formula for the centroid
of the ith coefficient in the kth quantization interval is:

cs^k)* £ 

25 

where 
Optimized DCT

The preferred embodiment of the optimized DCT 136 is
illustrated in FIG. 9. More specifically, the optimized DCT
136 uses the quantization table Q 202 to assign the DCT
coefficients (DC terms 200 and AC terms 201) quantization
step values. In addition, the quantization step values in
the quantization table Q 202 vary depending on the optimized
DCT 136 operation mode. The optimized DCT 136 operates in
four DCT modes as follows: 1) switched fixed uniform DCT
quantization tables that correspond to image classification,
2) optimal reconstruction values, 3) adaptive uniform DCT
quantization tables, and 4) adaptive non-uniform DCT
quantization tables.

The fixed DCT quantization tables are tuned to different
image types, including eight standard tables corresponding to
images differing along three dimensions: photographic versus
graphic, small-scale versus large-scale, and high-activity
versus low-activity. In the preferred embodiment, additional
tables can be added to the resource file 160 (not shown).
The control script 196 defines which standard table the
optimized DCT 136 uses in the fixed-table DCT mode. In the
fixed-table mode, the quantized value for each DCT
coefficient is obtained by linearly quantizing each DCT
coefficient x_i with the quantization value q_i in quantization
table Q. The mathematical relationship for the quantization
procedure is:

for i = 0, 1, ..., 63
if x_i >= 0,

if x_i < 0,
computes the threshold value E_Y, the threshold value E_U and
the threshold value E_X, the enhancement analyzer 144 tests
each 8x8 Y_tau2 block, each 4x4 U_tau4 block and each
4x4 X_tau4 block (each block corresponds to a 16 x 16 block
in the source image 100) as follows:

Every pixel in the test block is convolved with
the following filter masks:

M_1 = {-1,-2,-1,0,0,0,1,2,1}
M_2 = {1,0,-1,2,0,-2,1,0,-1}

to compute two statistics S_1 and S_2.

Masks M_1 and M_2 are convolved with a three by three block
of pixels centered on the pixel being tested. The three by
three block of pixels is represented as:

x_11 x_12 x_13
x_21 x_22 x_23
x_31 x_32 x_33

where the pixel x_22 is the pixel being tested. Thus the
statistics are calculated with the following equations:

S_1 = (-1·x_11) - (2·x_12) - (1·x_13) + (1·x_31) + (2·x_32) + (1·x_33)

S_2 = (1·x_11) - (1·x_13) + (2·x_21) - (2·x_23) + (1·x_31) - (1·x_33)

If S_1 plus S_2 is greater than the threshold value E_Y for
a particular 8x8 Y_tau2 block, the enhancement analyzer 144
adds the 8x8 Y_tau2 block to the enhancement list 250. If
S_1 plus S_2 is greater than the threshold value E_U for a
particular 4x4 U_tau4 block, the enhancement analyzer 144
adds the 4x4 U_tau4 block to the enhancement list 250. If
S_1 plus S_2 is greater than the threshold value E_X for a
particular 4x4 X_tau4 block, the enhancement analyzer 144
adds the 4x4 X_tau4 block to the enhancement list 250.
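
The per-pixel test can be sketched in Python as follows. This is a sketch under stated assumptions: the function names are hypothetical, the image is a list of rows, the tested pixel is assumed not to lie on the image border, and the comparison is applied exactly as stated in the text (S_1 plus S_2 against the threshold). How per-pixel results are aggregated into a block decision is not specified here and is left out.

```python
# Filter masks in row-major order over the 3x3 neighborhood x_11 .. x_33.
M1 = (-1, -2, -1, 0, 0, 0, 1, 2, 1)
M2 = (1, 0, -1, 2, 0, -2, 1, 0, -1)


def edge_statistics(img, row, col):
    """Return (S1, S2) for the 3x3 neighborhood centered on img[row][col]."""
    window = [img[row + dr][col + dc]
              for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    s1 = sum(m * p for m, p in zip(M1, window))
    s2 = sum(m * p for m, p in zip(M2, window))
    return s1, s2


def exceeds_threshold(img, row, col, threshold):
    """Apply the test as stated in the text: S1 + S2 > threshold."""
    s1, s2 = edge_statistics(img, row, col)
    return s1 + s2 > threshold
```

A neighborhood whose bottom row is brighter than the rows above it responds in S_1 only, while one whose left column is brighter responds in S_2 only, so the two masks separate the two edge orientations.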

In addition to the enhancement list 250, the enhancement
analyzer 144 also uses the DCT coefficients 198 to identify
visually unimportant "texture" regions where the compression
ratio can be increased without significant loss to the image
quality.
Referring to FIG. 31, in block 626 the image classifier
constructs a classification map 628 based upon membership
within the output sets. The classification map 628
identifies independent regions in the source image 100 that
are independently compressed. Thus the image classifier 152
identifies the regions of the image that belong to compatible
output sets 624. These are regions that contain relatively
homogeneous image contrast and call for one method or set of
complementary methods to be applied to the entire region.

In block 630 the image classifier 152 converts
(defuzzifies), based on the defuzzification rule base 632,
the membership of the fuzzy output sets 624 of each
independent region in order to generate the control script
196. The control script 196 contains instructions for which
compression methods to perform and what parameters, tables,
and optimization levels to employ for a particular region of
the source image 100.

The Enhancement Analyzer

The preferred embodiment of the enhancement analyzer 144
is illustrated in FIGs. 4, 15 and 30. More specifically, the
enhancement analyzer 144 examines the Y_tau2 miniature 190,
the U_tau2 miniature 192, and the X_tau4 miniature 228 to
determine the enhancement priority of image blocks that
correspond to 16 x 16 blocks in the original source image
100. The enhancement analyzer 144 prioritizes the image
blocks by 1) calculating the mean of the Y_tau2 miniature
190, the U_tau2 miniature 192, and the X_tau4 miniature 228,
and 2) testing every color block against a normalized
threshold value E 252 for the Y_tau2 miniature 190, the
U_tau2 miniature 192, and the X_tau4 miniature 228. Blocks
that exceed the threshold value E 252 are added to
the enhancement list 250.

The enhancement analyzer 144 determines a threshold
value E_Y for the Y_tau2 miniature 190, a threshold value E_U
for the U_tau2 miniature 192, and a threshold value E_X for
the X_tau4 miniature 228. Once the enhancement analyzer 144
photographic input set is the complement of the graphic input
set.

The color depth input set includes four classifications:
gray scale images, 4-bit images, 8-bit images and 24-bit
images. The color depth input corresponds to the input
measurements 614 for the Y, U and X color components. A
small dynamic range in the U and X color components indicates
that the picture is likely to be a gray scale image, while
gaps in the Y component histogram reveal whether the image
was once a palettized 4-bit or 8-bit image.

The special feature input set corresponds to the input
measurements 614 for the common or localized features that
bear on the compressibility of the image. Thus the special
feature input set identifies such artifacts as black borders
caused by inaccurate scanning and graphical titling on a
photographic image.

In block 622 the image classifier 152 maps the input
sets 618 onto output sets 624 according to the output rule
base 626. The image classifier 152 applies the output rule
base 626 to map each input set 618 onto membership of each
fuzzy output set 624. The output sets 624 determine, for
example, how many CS terms are stored in the CS data segment
204 and the optimization of the VQ1 data segment 224, the VQ2
data segment 258, the VQ3 data segment 242, the VQ4 data
segment 244, and the number of VQ patterns to use. The
output sets also determine whether the encoder 102 performs
an optimized DCT 136 and which quantization tables Q 202 to
apply.

For the second Reed Spline Filter 225 and the third Reed
Spline Filter 227, the output sets 624 adjust the decimation
factor tau and the orientation of the kernel function.
Finally, the output sets determine whether the channel
encoder 168 utilizes a fixed Huffman encoder, an adaptive
Huffman encoder or an LZ1. FIG. 33 illustrates several
examples of mapping from input measurements 614 to input sets
618 to output sets 624.
select a random sample of the plurality of blocks to use as
the basis of the input measurements 614.

The image classifier 152 determines the set of input
measurements 614 from the plurality of blocks using a variety
of methods. The image classifier 152 calculates the mean,
the variance, and a histogram of all three color components.
The image classifier 152 performs a discrete cosine transform
of the image blocks to derive a set of DCT components wherein
each DCT coefficient is histogrammed to provide a frequency
domain profile of the input image. The image classifier
152 performs special convolutions to gather information about
edge content, texture content, and the efficacy of the Reed
Spline Filter. The image classifier 152 derives spatial
domain blocks and matches the spatial domain blocks with a
special VQ-like pattern list to provide information about the
types of activity contained in the picture. Finally, the
image classifier scans the image for common and possibly
localized features that bear on the compressibility of the
image (such as typed text or scanning artifacts).

In block 616 the image classifier 152 analyzes the input
measurements 614 generated in block 612 to determine the
extent to which the source image 100 belongs to one of the
fuzzy input sets 618 within the input rule base 620. The
input rule base 620 identifies the list of image types. In
the preferred embodiment, the image classifier 152 contains
input sets 618 for the following image types: scale, text,
graphics, photographic, color depth, degree of activity, and
special features.

Membership in the activity input set and the scale image
input set are determined by the input measurements 614 for
the DCT coefficient histogram, the spatial statistics, and
the convolutions. Membership in the text image input set and
the graphic input set correspond to the input measurements
614 for a linear combination of high frequency DCT
coefficients and gaps in the luminance histogram. The

avoids the discrete switching from one compression method to
another compression method.

The fuzzy logic image classifier 152 receives the image
data and determines a set of image measurements which are
mapped onto one or more input sets. The image classifier 152
in turn maps the input sets to corresponding output sets that
identify which compression methods to apply. The output sets
are then blended ("defuzzified") to generate a control script
196. The process of mapping the input image to a particular
control script 196 thus requires three sets of rules: 1)
rules for mapping input measurements onto input sets (e.g.,
degree of membership with the "high activity" input set = F[
average of AC coefficients 56-63]); 2) rules for mapping
input sets onto output sets (e.g., if graphical image, use
DCT quantization table 5); and 3) rules for defuzzification
that mediate between membership of several output sets, i.e.,
how the membership of more than one output set should be
blended to generate a single control script 196 that controls
the compression process.
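
The three rule sets can be sketched as a small Python pipeline. This is purely illustrative and not the disclosed rule bases: the measurement names, the ramp-shaped membership function, the thresholds, the table names, and the scale factors are all hypothetical values chosen to show how memberships flow from measurements to a blended control value.

```python
def ramp(value, low, high):
    """Piecewise-linear membership: 0 at or below low, 1 at or above high."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def classify(avg_high_freq_ac, luminance_gap_ratio):
    """Map two hypothetical measurements to a blended control value."""
    # Rule set 1: input measurements -> input-set memberships.
    inputs = {
        "high_activity": ramp(avg_high_freq_ac, 2.0, 10.0),
        "graphic": ramp(luminance_gap_ratio, 0.1, 0.6),
    }
    inputs["photographic"] = 1.0 - inputs["graphic"]  # complement set

    # Rule set 2: input sets -> output sets (each names a quantization table).
    outputs = {
        "table_graphic": inputs["graphic"],
        "table_photo_busy": min(inputs["photographic"],
                                inputs["high_activity"]),
        "table_photo_smooth": min(inputs["photographic"],
                                  1.0 - inputs["high_activity"]),
    }

    # Rule set 3: defuzzify by a membership-weighted average of scale factors.
    scale = {"table_graphic": 0.5, "table_photo_busy": 1.5,
             "table_photo_smooth": 1.0}
    total = sum(outputs.values())
    blended = sum(outputs[name] * scale[name] for name in outputs) / total
    return {"memberships": outputs, "quantization_scale": blended}
```

Because the blended value changes continuously with the memberships, an image that sits halfway between "busy" and "smooth" receives an intermediate setting rather than an abrupt switch between methods.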

Still further, the fuzzy logic rule base is easily
maintained. The rules are modular. Thus, the rules can be
understood, researched, and modified independently of one
another. In addition, the rule bases are easily modified,
allowing new rules to make the image classifier 152 more
sensitive to different types of image content. Furthermore,
the fuzzy logic rule base is extendable to include additional
image types specified by the user or learned using neural
network or genetic programming methods.

FIG. 31 illustrates a block diagram of the image
classifier 152. In block 612 the image classifier 152
determines a set of input measurements 614 that correspond to
the source image 100. In order to determine the input
measurements 614, the image classifier 152 sub-divides the
source image 100 into a plurality of blocks. To conserve
computations, the user can enable the image classifier 152 to
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matches appropriate compression tools to image content must 
reliably distinguish between content types that require 
different compression techniques, and must also be able to 
judge how to blend tools when types requiring different tools 
overlap. 

FIG. 30 illustrates the optimization of the compression
process. The optimization process analyzes the input image
600 at different levels. In the top level analysis 602 the
image classifier 152 decomposes the image into a plurality of
subimages 604 (regions) of relatively homogeneous content as
defined by a classification map 606. The image classifier
152 then outputs the control script 196 that specifies which
compression methods or "tools" to employ in compressing each
region. The compression methods are further optimized in the
second level analysis 608 by the enhancement analyzer 144,
which determines which areas of an image are the most
visually important (for example, text and strong luminance
edges). The compression methods are then further optimized
in the third level analysis 610 with the optimized DCT 156,
AVQ 134, and adaptive methods in the channel encoder 168.
The second level analysis 608 and the third level analysis
610 determine how to adapt parameters and tables to a
particular image.

The fuzzy logic image classifier 152 provides adaptive
"intelligent" branching to appropriate compression methods
with a high degree of computational simplicity. It is not
feasible to provide the encoder 102 with an exhaustive
mapping of all possible combinations of inherently
non-linear, discontinuous, multidimensional inputs (image
measurements) onto desired control scripts 196. The fuzzy
logic image classifier 152 reduces such an analysis.

Furthermore, the fuzzy logic image classifier 152
ensures that the encoder 102 makes a smooth transition from
one compression method (as defined by the control script 196)
to another compression method. As image content becomes
"more like" one class than another, the fuzzy controller

to decide, based on statistical characteristics of the image,
what "tools" (combinations of compression methods) will best
compress the image.

The source image 100 may include a combination of
different image types. For example, a photograph could show
a person framed in a graphical border, wherein the person is
wearing a shirt that contains printed text. In order to
optimize the compression ratio for the regions of the image
that contain different image types, the image classifier 152
subdivides the source image 100 and then outputs the control
script 196 that specifies the correct compression methods for
each region. Thus, the image classifier 152 provides a
customized, "most-efficient" compression ratio for multiple
image types.

The image classifier 152 uses fuzzy logic to infer the
correct compression steps from the image content. Image
content is inherently "fuzzy" and is not amenable to simple
discrete classification. Images will thus tend to belong to
several "classes." For example, a classification scheme
might include one class for textual images and a second class
for photographic images. Since an image may comprise a
photograph of a person wearing a shirt containing printed
text, the image will belong to both classes to varying
degrees. Likewise, the same image may be high contrast,
"grainy," black and white and/or high activity.

Fuzzy logic is a set-theoretic approach to
classification of objects that assigns degrees of membership
in a particular class. In classical set theory, an object
either belongs to a set or it does not; membership is either
100% or 0%. In fuzzy set theory, an object can be partly in
one set and partly in another. The fuzziness is of greater
significance when the content must be categorized for the
purpose of applying appropriate compression techniques.
Relevant categories in image compression include
photographic, graphical, noisy, and high-energy. Clearly the
boundaries of these sets are not sharp. A scheme that

The scaler 466 decimates the input data by dividing the
source image into the desired number of output pixels and
then radiometrically weights the input data to form the
necessary output. FIG. 28 illustrates the scaler 466 with an
input to output ratio of five-to-three in the one dimensional
case. Input pixel P_1 538, pixel P_2 540, pixel P_3 542, pixel
P_4 544, and pixel P_5 546 contain different data values. The
output pixel X_1 548, pixel X_2 550, and pixel X_3 552 are
computed as follows:

\[
\begin{aligned}
X_1 &= P_1 + (P_2)(0.67) \\
X_2 &= (P_2)(0.33) + P_3 + (P_4)(0.33) \\
X_3 &= (P_4)(0.66) + P_5
\end{aligned}
\]

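The five-to-three weights can be reproduced by measuring how much of each input pixel falls inside each output pixel. The following Python sketch assumes that interpretation of the radiometric weighting; the function name decimate_1d is hypothetical, and the sums are left unnormalized, as in the equations above.

```python
from fractions import Fraction


def decimate_1d(pixels, n_out):
    """Area-weighted decimation of a 1-D row of pixels to n_out outputs."""
    span = Fraction(len(pixels), n_out)  # input pixels covered per output pixel
    out = []
    for k in range(n_out):
        start, end = k * span, (k + 1) * span
        acc = Fraction(0)
        for i, p in enumerate(pixels):
            # Length of the overlap between input pixel [i, i+1) and [start, end).
            overlap = min(end, i + 1) - max(start, i)
            if overlap > 0:
                acc += overlap * p
        out.append(acc)
    return out
```

With five inputs and three outputs the overlaps come out as 1 and 2/3 for the first output, 1/3, 1 and 1/3 for the second, and 2/3 and 1 for the third, which round to the 0.67 and 0.33 factors in the text. Dividing each sum by the span (5/3 here) would give a true area average.
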
The decimated data is then filtered with a
reconstruction filter and an area average filter. The
reconstruction filter interpolates the input data by
replicating the pixel data. The area average filter then
area averages by integrating the area covered by the output
pixel.

If the output ratio is less than 1 (i.e., interpolation
is necessary), the interpolator 462 utilizes bilinear
interpolation. FIG. 29 illustrates the operation of the
bilinear interpolation. Input pixel A 554, input pixel B
556, input pixel C 558, and input pixel D 560, and reference
point X 562 are interpolated to create output 564. For this
example reference point X 562 is α to the right of pixel A
554 and 1-α to the left of pixel C 558, and reference point
X 562 is β down from pixel A 554 and 1-β up from pixel B 556.
Reference point X 562 is stated formally as:

\[
X = (1-\alpha)\,\bigl((1-\beta)A + \beta B\bigr) + \alpha\,\bigl((1-\beta)C + \beta D\bigr)
\]
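
The bilinear formula can be written directly as a small Python function. This is a sketch of the formula only; the function name and argument order are hypothetical, with A and B taken as the left-hand pixels (top and bottom) and C and D as the right-hand pixels.

```python
def bilinear(a, b, c, d, alpha, beta):
    """Interpolate at a point alpha to the right of A and beta below A."""
    left = (1 - beta) * a + beta * b    # blend down the A-B column
    right = (1 - beta) * c + beta * d   # blend down the C-D column
    return (1 - alpha) * left + alpha * right
```

At the corners the function returns the corner pixel itself, and at the center (alpha = beta = 0.5) it returns the mean of the four pixels.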
The Image Classifier

The preferred embodiment of the image classifier 152 is
illustrated in FIG. 8. More specifically, the image
classifier 152 uses fuzzy logic techniques to determine which
compression methods will optimize the compression of various
regions of the source image 100. The image classifier 152
adds intelligence to the encoder 102 by providing the means

dimensional column inverse DCT.

The equation for a one dimensional case is as follows
(1dout_x are the elements of the one dimensional case):

\[
\begin{aligned}
\mathrm{1dout}_0 &:= in_0 + (k_1 \cdot in_1) + (k_2 \cdot in_2) + (k_3 \cdot in_3) \\
\mathrm{1dout}_1 &:= in_0 + (k_4 \cdot in_1) - (k_2 \cdot in_2) - (k_5 \cdot in_3) \\
\mathrm{1dout}_2 &:= in_0 - (k_4 \cdot in_1) - (k_2 \cdot in_2) + (k_5 \cdot in_3) \\
\mathrm{1dout}_3 &:= in_0 - (k_1 \cdot in_1) + (k_2 \cdot in_2) - (k_3 \cdot in_3)
\end{aligned}
\]

\[
k_1 := \frac{c(1)+c(3)}{\sqrt{2}} \qquad
k_2 := \frac{c(2)+c(6)}{\sqrt{2}} \qquad
k_3 := \frac{c(3)-c(7)}{\sqrt{2}} \qquad
k_4 := \frac{c(5)+c(7)}{\sqrt{2}} \qquad
k_5 := \frac{c(5)+c(1)}{\sqrt{2}}
\]

where c(k) is defined as in the 2x2 output matrix.

The scaler 466 of the preferred embodiment is also shown
in FIG. 27. More specifically, the scaler 466 utilizes a
generalized routine that scales the image up or down while
reducing aliasing and reconstruction noise. Scaling can be
described as a combination of decimation and interpolation.
The decimation step consists of downsampling and using an
anti-aliasing filter; the interpolation step consists of
pixel filling using a reconstruction filter for any scale
factor that can be represented by a rational number P/Q,
where P and Q are integers associated with the interpolation
and decimation ratios,

All elements with i or j greater than 1 are set to zero.
The setting of the high frequency index to zero is equivalent
to filtering out the high frequency coefficients from the
signal.

Assigning Y as the 2x2 output matrix, the decimated
output is thus equal to:

\[
\begin{aligned}
Y_{0,0} &:= X_{0,0} + (k_1 \cdot X_{0,1}) + (k_1 \cdot X_{1,0}) + (k_2 \cdot X_{1,1}) \\
Y_{0,1} &:= X_{0,0} - (k_1 \cdot X_{0,1}) + (k_1 \cdot X_{1,0}) - (k_2 \cdot X_{1,1}) \\
Y_{1,0} &:= X_{0,0} + (k_1 \cdot X_{0,1}) - (k_1 \cdot X_{1,0}) - (k_2 \cdot X_{1,1}) \\
Y_{1,1} &:= X_{0,0} - (k_1 \cdot X_{0,1}) - (k_1 \cdot X_{1,0}) + (k_2 \cdot X_{1,1})
\end{aligned}
\]

where

\[
k_1 := \frac{1}{2\sqrt{2}}\,\bigl(c(1)+c(3)+c(5)+c(7)\bigr) \qquad
c(k) := \cos\frac{k\pi}{16} \qquad
k_2 := k_1^{2}
\]
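
As a numerical check on the 2x2 equations, the Python sketch below compares them with a direct computation: a conventional 8x8 inverse DCT followed by averaging each 4x4 quadrant. Several things here are assumptions rather than statements from the text: the inverse DCT is taken in its standard form (scale factor 1/4, C_0 = 1/√2), k_1 is taken as (c(1)+c(3)+c(5)+c(7))/(2√2), k_2 as k_1 squared, and all function names are hypothetical. Under those assumptions the two results agree up to a constant factor of 8, because the equations give X_{0,0} a coefficient of one.

```python
import math

C = [math.cos(k * math.pi / 16) for k in range(8)]      # c(k) = cos(k*pi/16)
K1 = (C[1] + C[3] + C[5] + C[7]) / (2 * math.sqrt(2))   # assumed constant k_1
K2 = K1 * K1                                            # assumed constant k_2


def fast_2x2(X):
    """2x2 output computed from the four lowest-frequency coefficients."""
    sign = (1, -1)
    return [[X[0][0]
             + sign[c] * K1 * X[0][1]
             + sign[r] * K1 * X[1][0]
             + sign[r] * sign[c] * K2 * X[1][1]
             for c in range(2)]
            for r in range(2)]


def idct_8x8(X):
    """Reference 8x8 inverse DCT in the standard form."""
    def cu(u):
        return 1 / math.sqrt(2) if u == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for i in range(8):
        for j in range(8):
            s = 0.0
            for u in range(8):
                for v in range(8):
                    s += (cu(u) * cu(v) * X[u][v]
                          * math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16))
            out[i][j] = s / 4
    return out


def averaged_2x2(X):
    """Full inverse DCT, then the average of each 4x4 quadrant."""
    img = idct_8x8(X)
    return [[sum(img[4 * r + i][4 * c + j]
                 for i in range(4) for j in range(4)) / 16
             for c in range(2)]
            for r in range(2)]
```

The fast path needs four multiply-accumulate terms per output, whereas the reference path evaluates all 64 spatial samples first, which illustrates why folding the decimation into the frequency domain saves work.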

The creation of a 4 x 4 output matrix where a given X is
an 8 x 8 input matrix that consists of DC terms 201 and AC
terms 200 is stated formally as:

All elements with i or j greater than 3 are set to zero.

It is possible to implement the calculations in the
2x2 case where the two dimensional equation is decomposed
downward; however, performing the one dimensional approach
twice reduces complexity and decreases the calculation time.
In the preferred embodiment, the inverse DCT 476 computes an 
additional one-dimensional row inverse DCT, and then a one-

The equation for an inverse DCT is:
where

\[
u := 0 \ldots 7 \qquad v := 0 \ldots 7 \qquad x := 0 \ldots 7 \qquad y := 0 \ldots 7
\]

\[
C_u := \frac{1}{\sqrt{2}}\,(u = 0) + (u \neq 0)
\]
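
For reference, the conventional 8 x 8 inverse DCT that is consistent with the index ranges and the definition of C_u is sketched below. It is offered as the standard textbook form, on the assumption that this is the form intended here; F(u,v) and f(x,y) are assumed names for the frequency coefficients and the output samples, and C_v is defined in the same way as C_u.

```latex
\[
f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7}
         C_u \, C_v \, F(u,v)\,
         \cos\frac{(2x+1)\,u\,\pi}{16}\,
         \cos\frac{(2y+1)\,v\,\pi}{16}
\]
```

In this notation the term (u = 0) evaluates to one when u is zero and to zero otherwise, so C_u is 1/√2 for the DC index and 1 for every other index.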



The inverse DCT 476 generates an 8 x 8 output matrix
that is decimated to a 4 x 4 matrix then to a 2 x 2 matrix.
The inverse DCT 476 then decimates the output matrix by
subsampling with a filter. After subsampling, an averaging
filter smooths the output. Smoothing is accomplished by
using a running average of the adjacent elements to form the
output.

For example, for a 4 x 4 output matrix the 8 x 8 matrix
from the inverse DCT 476 is sub-divided into sixteen 2 x 2
regions, and adjacent elements within each 2 x 2 region are
averaged to form the output. Thus the sixteen regions form
a 4 x 4 matrix output.

For a 2 x 2 output matrix, the 8 x 8 matrix from the
inverse DCT 476 is sub-divided into four 4 x 4 regions. The
adjacent elements within each 4 x 4 matrix region are
averaged to form the output. Thus, the four regions form a
2 x 2 matrix output.
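
The region averaging described above can be sketched in a few lines of Python. The function name block_average is hypothetical, and the sketch assumes a square input whose side is a multiple of the output size.

```python
def block_average(matrix, out_size):
    """Average square regions of an n x n matrix down to out_size x out_size."""
    step = len(matrix) // out_size   # side of each averaged region
    area = step * step
    return [[sum(matrix[r * step + i][c * step + j]
                 for i in range(step) for j in range(step)) / area
             for c in range(out_size)]
            for r in range(out_size)]
```

Calling block_average(m, 4) on an 8 x 8 matrix averages sixteen 2 x 2 regions, and block_average(m, 2) averages four 4 x 4 regions.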

In addition, since most of the AC coefficients are zero,
the inverse DCT 476 is simplified by combining the inverse
DCT equations with the averaging and the decimation
equations. Thus, the creation of a 2 x 2 output matrix where
a given X is an 8 x 8 input matrix that consists of DC terms
201 and AC terms 200 is stated formally as:

The decoder 110 repeats the process illustrated in FIG. 27
to generate a new full sized Y image 520, a new full sized U
image 522, and a new full sized X image 530. The new full
sized Y image 520 is added to the full sized Y image
generated in the third step 444. The new full sized U image
522 is added to the full sized U image 522 generated in the
third step 444. The new full sized X image 530 is added to
the full sized X image generated in the third step 444.

The inverse color converter 468 converts the full sized
Y image 520, the full sized U image 522, and the full sized
X image 530 into a full sized red, green, and blue image.
The panel is then added to the displayed image. This process
is completed for each panel until the entire enhanced image
105 is expanded.

The inverse DCT 476 of the preferred embodiment is a
mathematical transformation for mapping data in the
frequency domain back to the time (or spatial) domain, based
on the "cosine" kernel. The two dimensional version operates
on a block of 8 x 8 elements.

Referring to FIG. 9, the compressed DCT coefficients 198
are stored as DC terms 201 and AC terms 200. In the
preferred embodiment, the inverse DCT 476 as shown in FIGs.
25 and 27 combines the process of transformation and
decimation in the frequency and spatial domains (frequency
and then spatial) into a single operation in the frequency
domain. The inverse DCT 476 of the present invention
provides at least a factor of 2 in implementation efficiency
and is utilized by the decoder 110 to expand the thumbnail
miniature 120 and splash image 122.
The inverse DCT 476 receives a sequence of DC terms 201
and AC terms 200 which are frequency coefficients. The high
frequency terms are arbitrarily discarded at a predefined
frequency to prevent aliasing. The discarding of the high
frequency terms is equivalent to a low pass filter which
passes everything below a predefined frequency while
attenuating all the high frequencies to zero.

by a factor of four and the adder 456 adds the 4x4 rU_tau4
residual blocks 526 to the interpolated Um miniature 438 in
order to create a Um+r miniature 528. The interpolator 462
interpolates the Um+r miniature 528 by a factor of four to
create the full sized U image 522.

To calculate the full sized X image 530, the inverse
Huffman encoder 458 expands the VQ4 data segment 244. The
pattern matcher 524 uses the codebook indexes to retrieve the
matching pattern blocks stored in the codebook 214 to expand
the VQ4 data segment 244 into 4x4 rX_tau4 residual blocks.
The decoder 110 then translates the 4x4 rX_tau4 residual
blocks 532 into 4x4 rV_tau4 residual blocks 534. The
interpolator 462 interpolates the Xm miniature 460 by a
factor of four, and the adder 456 adds the 4x4 rV_tau4
residual blocks 534 to the interpolated Xm miniature 460 in
order to create an Xm+r miniature 536. The interpolator 462
interpolates the Xm+r miniature 536 by a factor of four to
create the full sized X image 530.

The decoder stores the full sized Y image 520, the full
sized U image 522, and the full sized X image 530 in local
memory. The inverse color converter 468 then converts the
full sized Y image 520, the full sized U image 522, and the
full sized X image 530 into a full sized red, green, and blue
image. The panel is then added to the displayed image. This
process is completed for each panel until the entire source
image 100 is expanded.

In the fourth step the decoder 110 receives the third
image layer and builds upon the full sized Y image 520, the
full sized U image 522, and the full sized X image 530 stored
in local memory to generate the enhanced image 105. The
third image data layer contains the remaining portion of the
DCT data segment 208, the VQ1 data segment 224, the VQ2 data
segment 258, the enhancement location data segment 510, the
VQ3 data segment 242, and the VQ4 data segment 244 that
correspond to the enhanced image 105.

from the inverse Huffman encoder 458 with the AC terms 200
and the DC terms 201. The dequantizer 450 reverses the
quantization process by multiplying the DCT quantized values
206 with the quantization factors 478. The dequantizer
obtains the correct quantization factors 478 from the
quantization table Q 202. The dequantizer outputs 8x8 DCT
coefficient blocks 482 to the inverse DCT 476. The inverse
DCT 476, in turn, outputs the 8x8 DCT coefficient blocks 482
that correspond to a Y image 509 that is 1/4th the size of
the original image.

The pattern matcher 524 replaces the DCT residual blocks
512 by finding an index to a matching pattern block in the
codebook 214. The adder 456 adds the DCT residual blocks 512
to the DCT coefficient blocks 482 on a pixel by pixel basis.
The interpolator 462 interpolates the output of the adder 456
by a factor of four to create a full size Y image 520. The
interpolator 462 performs bilinear interpolation to enlarge
the Y image 520 horizontally and vertically.

The inverse Huffman encoder 458 decompresses the VQ2
data segment 258 (the high resolution residual) and the
enhancement location data segment 510. The pattern matcher
524 uses the codebook indexes to retrieve the matching
pattern blocks stored in the codebook 214 to expand the VQ2
data segment 258 to create 16 x 16 high resolution residual
blocks 514. An enhancement overlay builder 516 inserts the
16 x 16 high resolution residual blocks into a Y image
overlay 518 specified by the edge location data segment 510.
The Y image overlay 518 is the size of the original image.
The adder 456 adds the Y image overlay 518 to the full sized
Y image 520.
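
The lookup, overlay, and add sequence can be sketched as follows in Python. The representation is an assumption made for illustration: the location data is modeled as a list of hypothetical (top, left) pixel positions, blocks are small nested lists rather than 16 x 16 arrays, and the function names are not taken from the text.

```python
def build_overlay(height, width, indexes, locations, codebook):
    """Paste codebook blocks into an all-zero overlay image."""
    overlay = [[0] * width for _ in range(height)]
    for index, (top, left) in zip(indexes, locations):
        # The codebook index selects the residual pattern to place.
        for r, row in enumerate(codebook[index]):
            for c, value in enumerate(row):
                overlay[top + r][left + c] = value
    return overlay


def add_overlay(image, overlay):
    """Pixel-by-pixel sum of an image and an overlay of the same size."""
    return [[p + o for p, o in zip(image_row, overlay_row)]
            for image_row, overlay_row in zip(image, overlay)]
```

Pixels that no block covers keep an overlay value of zero, so adding the overlay changes only the regions that were selected for enhancement.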

To calculate the full sized U image 522, the inverse
Huffman encoder 458 expands the VQ3 data segment 242. The
pattern matcher 524 uses the codebook indexes to retrieve the
matching pattern blocks stored in the codebook 214 to expand
the VQ3 data segment 242 into 4x4 rU_tau4 residual blocks
526. The interpolator 462 interpolates the Um miniature 438

the Xm replication factor 508 so that the replicated image is
one-fourth of the display size.

The inverse color converter 468 transforms the
replicated image data into red, green and blue image data.
The replicator 464 then again replicates the red, green, and
blue image data to match the display size. The decoder 110
displays the resulting splash image 122 on the display 112.

FIG. 27 illustrates the third step 444 in which the
decoder 110 generates the higher detail panels to expand the
thumbnail miniature 120 into a standard image 124. FIG. 27
also illustrates the fourth step 446 in which the decoder 110
generates higher detail panels to enhance the detail
of the standard image in order to create an enhanced image
105 that corresponds to the source image 100.

The decoding of the standard image 124 and the enhanced
image 105 requires the inverse Huffman encoder 458, the
combiner 452, the dequantizer 450, the inverse DCT 476, a
pattern matcher 524, the adder 456, the interpolator 462, and
an edge overlay builder 516. The decoder 110 adds additional
detail to the displayed image as the decoder 110 receives new
layers of compressed data. The additional layers include new
panels of the DCT data segment 208 (containing the remaining
AC terms 200'), the VQ1 data segment 224, the VQ2 data
segment 258, the enhancement location data segment 510, the
VQ3 data segment 242, and the VQ4 data segment 244.

The decoder 110 builds upon the Ym miniature 436, the Um
miniature 438 and the Xm miniature 440 calculated for the
thumbnail miniature 120 by expanding the next layer of image
detail. The next layer contains a portion of the DCT data
segment 208, the VQ1 data segment 224, the VQ2 data segment
258, the enhancement location data segment 510, the VQ3 data
segment 242, and the VQ4 data segment 244 that correspond to
the standard image.

The inverse Huffman encoder 458 decompresses the DCT
data segment 208 and the VQ1 data segment 224 (the DCT
residual). The combiner 452 combines the DCT information

correspond to the Um miniature 438 and the Xm miniature 440.
The adder 456 translates the blocks corresponding to the Um
miniature 438 and the Xm miniature 440 into blocks that
correspond to an Xm miniature 460.
FIG. 26 illustrates the second step 442 in which the
decoder 110 expands the Ym miniature 436, the Um miniature
438, and the Xm miniature 460. The decoder 110 further
includes the interpolator 462 that operates on the Ym
miniature 436, the Um miniature 438 and the Xm miniature 460.
The interpolator 462 is controlled by a Ym interpolation
factor 484, a Um interpolation factor 486, and an Xm
interpolation factor 496. A scaler 466 is controlled by a Ym
scale factor 490, a Um scale factor 492, and an Xm scale
factor 494. The decoder 110 further includes the replicator
464 and the inverse color converter. The interpolator 462
uses a linear interpolation process to enlarge the Ym
miniature 436, the Um miniature 438, and the Xm miniature 460
by one, two or four times in both the horizontal and vertical
directions.

The Ym interpolation factor 484, the Um interpolation
factor 486, and the Xm interpolation factor 488 control the
amount of interpolation. The size of the source image 100 in
the compressed file 104 is fixed, thus the decoder 110 may
need to enlarge or reduce the expanded image before display.
The decoder 110 sets the Ym interpolation factor 484 to a
power of 2 (i.e., 1, 2, 4, etc.) in order to optimize the
decoding process. However, in order to display an expanded
image at the proper size, the scaler 466 scales the
interpolated image to accommodate different display formats.
The interpolator 462 also expands the Um miniature 438
and the Xm miniature 440. Like the Ym interpolation factor
484, the decoder 110 sets the Um interpolation factor 486 and
the Xm interpolation factor 496 to a power of two. The
decoder 110 sets the Ym interpolation factor 484, and the Um
interpolation factor 486 so that the Um miniature 438 and Xm
miniature 460 approximate the size of the interpolated and
scaled Ym miniature 436.

layer of image data and generates the higher detail panels
445 needed to expand the thumbnail miniature 120 into a
standard image 124. In a fourth step 446, the decoder 110
receives a third layer of image data to generate higher
detail panels to enhance the detail of the standard image in
order to create an enhanced image 105 that corresponds to the
source image 100.

FIG. 25 illustrates the elements of the first step 434
in which the decoder 110 expands the AC terms 200, the DC
terms 201, the URCA data segment 246, and the XRCA data
segment 248 into the Ym miniature 436, the Um miniature 438,
and Xm miniature 440. The first step 434 includes an inverse
Huffman encoder 458, an inverse DPCM 476, a dequantizer 450,
a combiner 452, an inverse DCT 476, a demultiplexer 454, and
an adder 456.

The decoder 110 then separates the DC terms 201 and the
AC terms 200 from the URCA data segment 246 and the XRCA data
segment 248. The inverse Huffman encoder 458 decompresses
the first layer of the data stream 118 which includes the AC
terms 200, the URCA data segment 246, and the XRCA data
segment 248. The inverse DPCM 476 further expands the DC
terms 201 to output DC terms 201'. The dequantizer 450
further expands the AC terms 200 to output AC terms 200' by
multiplying the AC terms 200 with the quantization
factors 478 in the quantization table Q 202 to output 8x8
DCT coefficient blocks 482. The quantization table Q 202 is
stored in the CS data segment 204 (not shown).
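
The two expansion stages named above can be sketched in Python. The text does not state which predictor the DPCM uses, so the sketch assumes the simplest case, in which each transmitted value is the difference from the previously reconstructed value; both function names and the sample numbers are hypothetical.

```python
def inverse_dpcm(differences):
    """Rebuild values from a first value followed by successive differences.

    Assumes a previous-value predictor, which is an illustrative choice.
    """
    values, previous = [], 0
    for d in differences:
        previous += d
        values.append(previous)
    return values


def dequantize(quantized, factors):
    """Multiply each quantized term by its quantization factor."""
    return [q * f for q, f in zip(quantized, factors)]
```

For example, inverse_dpcm([100, 2, -3, 0]) rebuilds the sequence 100, 102, 99, 99, and dequantize([3, -1, 0], [16, 11, 10]) scales the three terms back to 48, -11 and 0.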

The combiner 452 combines the output DC terms 201' with
the 8x8 DCT coefficient blocks 482. The decoder 110 sets
the inverse DCT factor 480, and the inverse DCT 476 outputs
the DCT coefficient blocks 482 that correspond to the Ym
miniature 436 that is 1/256th the size of the original image.

The demultiplexer 454 separates the inverse Huffman
encoded URCA data segment 246 from the XRCA data segment 248.
The inverse DPCM 476 then expands the URCA data segment 246
and the XRCA data segment 248 to generate the blocks that

The plurality of data segments 166 in the first layer
are also interleaved panel-by-panel to allow the thumbnail
miniature 120 and splash image 122 to be decoded a panel at
a time. The second layer contains the remaining plurality of
data segments 166 needed to expand the compressed file 104
into the final image. The plurality of data segments 166 in
the second layer are also interleaved panel-by-panel.

Block 432 in FIG. 22d shows the compressed file format
of the thumbnail image 120, the splash image 122, the layered
standard image 124, and the sharp image 125. The thumbnail
miniature 120 and splash image 122 are arranged in the first
layer as described above. The remaining data segments 166
are layered at different quality levels. The multi-layering
is accomplished by layering and interleaving panel
information associated with the VQ2 data segment 258 (high
resolution residual). The multiple layers allow the display
of all the panels at a particular level of detail before
decoding the panels in the next layer.
The Decoder

FIG. 23 illustrates the decoder 110 of the present
invention. The decoder 110 takes as input the compressed
data stream 118 and expands or decodes it into an image for
viewing on the display 112. As explained above, the
compressed file 104 and the transmitted data stream 118
include image components that are layered with a plurality of
panels 433. The decoder 110 expands the plurality of panels
433 one at a time.

As illustrated in FIG. 24, the decoder 110 expands the
compressed file 104 in four steps. In a first step 434, the
decoder 110 expands the first layer of image data in the
compressed file 104 or the data stream 118 into a Ym
miniature 436, a Um miniature 438, and an Xm miniature 440.
In a second step 442, the decoder 110 uses the Ym miniature
436, the Um miniature 438, and the Xm miniature 440 to
generate the thumbnail miniature 120, and the splash image
122. In a third step 444, the decoder 110 receives a second

interleaved based on the user defined playback model 261 as
follows: 1) as a single-pass, non-panellized image (FIG.
22a), 2) as a single-pass, panellized image (FIG. 22b), 3) as
two layers comprising the thumbnail miniature 120, and the
sharp image 125 (FIG. 22c), and 4) as multiple layers
comprising the thumbnail miniature 120, the standard image
124, and the sharp image 125 (FIG. 22d).

Block diagram 426 in FIG. 22a shows the compressed file
format for the single-pass, non-panellized image. The
compressed file 104 begins with the header, the optional
color palette and the resource data such as the tables and
Huffman encoding information. The plurality of data segments
166 are not interleaved or layered. Thus, the decoder 110
must receive the entire compressed file 104 before any part
of the source image 100 can be displayed.

Block diagram 428 in FIG. 22b shows the compressed file
104 for the single-pass, panellized image. The plurality of
data segments 166 are interleaved panel-by-panel, so that all
of the segments for each panel are contiguously transmitted.
The decoder 110 can expand and display a panel at a time
until the entire compressed file 104 is expanded.
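
Panel-by-panel interleaving amounts to switching from segment-major to panel-major order. The Python sketch below illustrates only that reordering; the dictionary layout, the segment names used as keys, and the function name are hypothetical and do not describe the actual byte layout of the compressed file 104.

```python
def interleave_by_panel(segments):
    """Reorder per-segment panel lists so each panel's data is contiguous.

    segments maps a segment name to a list holding one payload per panel.
    """
    names = list(segments)
    n_panels = len(segments[names[0]])
    stream = []
    for panel in range(n_panels):
        for name in names:
            stream.append((panel, name, segments[name][panel]))
    return stream
```

Because every payload for panel 0 precedes any payload for panel 1, a decoder reading the stream in order can finish and display one panel before the next one arrives.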

Block diagram 430 in FIG. 22c shows the compressed file
format of the thumbnail miniature 120, the splash image 122
and the final or sharp image 125. The plurality of data
segments 166 are interleaved panel-by-panel and the
resolution components for the thumbnail miniature 120 and
splash image 122 exist in the first layer; the panels for the
final image exist in the second layer. The first layer
includes selected portions of the plurality of data segments
166 that are needed to decode the panels of the thumbnail
miniature 120 and splash image 122. Thus, the compressed
file 104 only stores the low detail color components (URCA
data segment 246, the XRCA data segment 248), the DC terms
201 and as many as the first five AC terms 200 in the first
layer. The number of AC terms 200 depends on the
user-selected quality of the thumbnail miniature 120.

encoded by the new methods. Thus, to remain compatible with
prior encoders 102, the decoder 110 needs to identify which
encoder 102 generated the compressed data. In the preferred
embodiment, byte 7 420 identifies the encoder 102 and byte 2
410, byte 3 412, byte 4 414, byte 5 416, and byte 6 418 are
reserved for future enhancements to the encoder 102.

FIG. 21 illustrates the normal segment 402 as a sequence
of bytes that are logically separated into two sections: an
identifier section 422 and a data section 424. The
identifier section 422 precedes the data section 424. The
identifier section 422 specifies the size of the normal
segment 402, and identifies a segment type. The data section
424 contains information about the source image 100.

The identification section 422 is a sequence of one,
two, or three bytes that identifies the length of the normal
segment 402 and the segment type. The segment type is an
integer number that specifies the method of data encoding.
The compressed file 104 contains 256 possible segment types.
The data in the normal segment 402 is formatted according to
the segment type. In the preferred embodiment, the normal
segments 402 are optimally formatted for the color palette,
the Huffman bitstreams, the Huffman tables, the image panels,
the codebook information, the vector dequantization tables,
etc.

For example, the file format of the preferred embodiment
allows the use of different Huffman bitstreams such as an
8-bit Huffman stream, a 10-bit Huffman stream, and a DCT
Huffman stream. The encoder 102 uses each Huffman bitstream
to optimize the compressed file 104 in response to different
image types. The identification section 422 identifies which
Huffman encoder was used and the normal segment 402 contains
the compressed data.

FIGs. 22a, 22b, 22c, and 22d illustrate the layering and
interleaving of the plurality of data segments 166 in the
compressed file 104 of the preferred embodiment. The
plurality of data segments 166 in the compressed file 104 are

122 and the standard image 124 before the entire compressed
file 104 is transferred. As the decoder 110 receives each
successive layer of components, the decoder 110 adds
additional detail to the displayed image.

In addition to layering the compressed data, the
segmented architecture allows the decoder 110 of the
preferred embodiment: 1) to move from one segment to the next
in the stream without fully decoding segments of data, 2) to
skip parts of the data stream 118 that contain data that is
unnecessary for a given rendition of the image, 3) to ignore
parts of the data stream 118 that are in an unknown format,
4) to process the data in an order that is configurable on
the fly if the entire data stream 118 is stored locally, and
5) to store different layers of the compressed file 104
separately from one another.

As shown in FIG. 20a, the byte arrangement of the data
stream 118 and the compressed file 104 includes a header
segment 400 and a normal segment 402. The header segment 400
contains header information, and the normal segment 402
contains data. The header segment 400 is the first segment
in the compressed file 104 and is the first segment
transmitted with the data stream 118. In the preferred
embodiment, the header segment 400 is eight bytes long.

As shown in FIG. 20b, the byte arrangement of the header
segment 400 includes a byte 0 406 and a byte 1 408 of the
header segment 400. Byte 0 406 and byte 1 408 of the header
segment 400 identify the data stream 118. Byte 1 408 also
indicates if the data stream 118 contains image data
(indicated by a "G") or if it contains resource data
(indicated by a "C"). Resource data includes color lookup
tables, font information, and vector quantization tables.
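
A reader for the eight-byte header segment 400 might look like the following Python sketch. It relies only on what the description states: byte 1 is "G" for image data or "C" for resource data, bytes 2 through 6 are reserved, and byte 7 identifies the encoder in the preferred embodiment. The value of byte 0 is not given, so the sketch does not validate it; the type and function names are hypothetical.

```python
from typing import NamedTuple


class Header(NamedTuple):
    signature: bytes   # bytes 0-1, which identify the data stream
    kind: str          # "image" or "resource", decoded from byte 1
    reserved: bytes    # bytes 2-6
    encoder_id: int    # byte 7


def parse_header(segment: bytes) -> Header:
    """Split an eight-byte header segment into its fields."""
    if len(segment) != 8:
        raise ValueError("header segment must be eight bytes long")
    marker = segment[1:2]
    if marker == b"G":
        kind = "image"
    elif marker == b"C":
        kind = "resource"
    else:
        raise ValueError("byte 1 must be 'G' or 'C'")
    return Header(segment[0:2], kind, segment[2:7], segment[7])
```

A decoder could branch on encoder_id to select the decoding path that matches the encoder version that produced the stream.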

Byte 2 410, byte 3 412, byte 4 414, byte 5 416, byte 6
418 and byte 7 420 of the header segment 400 specify which
encoder 102 created the data stream 118. As new encoding
methods are added to the encoder 102, new versions of the
encoder 102 will be sold and distributed to decode the data

Also, if the accumulated squared error for a particular
codebook pattern is less than a pre-determined threshold, the
codebook pattern is immediately accepted and the AVQ 134
quits testing other codebook patterns. Furthermore, the
codebook patterns in the present invention are ordered
according to the frequency of matches. Thus, the AVQ 134
begins by comparing the input block with patterns in the
codebook 214 that are most likely to match. Still further,
the codebook patterns are grouped by the sum of their squared
amplitudes. Thus the AVQ 134 selects a group of similar
codebook patterns by summing the squared amplitude of an
input block in order to determine which group of codebook
patterns to search.
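
Two of the accelerations described for the codebook search, abandoning a pattern as soon as its running squared error can no longer beat the best found so far and accepting a pattern outright once its total error falls below a threshold, can be sketched in Python as follows. The function and parameter names are hypothetical, the codebook is assumed to be already ordered by match frequency, and grouping by squared amplitude is left out for brevity.

```python
def find_pattern(block, codebook, accept_below=0.0):
    """Return (index, squared_error) of the chosen codebook pattern."""
    best_index, best_error = -1, float("inf")
    for index, pattern in enumerate(codebook):
        error = 0.0
        for x, p in zip(block, pattern):
            error += (x - p) ** 2
            if error >= best_error:
                break              # already no better than the current best
        else:
            # The whole pattern was tested and it beats the previous best.
            best_index, best_error = index, error
            if best_error < accept_below:
                break              # good enough: stop testing other patterns
    return best_index, best_error
```

The sketch ranks patterns by total squared error; dividing by the block size to obtain the mean square error would not change which pattern is selected.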

Besides improving the time it takes for the AVQ 134 to
find an optimal codebook pattern, the AVQ 134 includes a set
of codebooks 214 that are adapted to the input blocks (i.e.,
codebooks 214 that are optimized for input blocks that
contain DCT residual values, high resolution residual values,
etc.). Finally, the AVQ 134 of the preferred embodiment
adapts a codebook 214 to the source image 100 by devising a
set of new patterns to add to a codebook 214.

Therefore, the AVQ 134 of the preferred embodiment has 
three modes of operation: 1) the AVQ 134 uses a specified 
codebook 214, 2) the AVQ 134 selects the best-fit codebook 
214, or 3) the AVQ 134 uses a combination of existing 
codebooks 214, and new patterns that the AVQ 134 creates. If 
the AVQ 134 creates new patterns, the AVQ 134 stores the new 
patterns in the VQCB data segment 223.
The Compressed File Format

FIGs. 20a and 20b illustrate the segmented architecture 
of the data stream 118 that results from transmitting the 
compressed file 104. The segmented architecture of the 
compressed file 104 in the preferred embodiment allows 
layering of the compressed image data. Referring to FIG. 2, 
the layering of the compressed file 104 allows the decoder 
110 to display the thumbnail miniature 120, the splash image 

which pattern in the codebook 214 has the minimum squared
error (also referred to as the minimum error). The error
term is the mean square error produced by subtracting the
pattern element P_ik from the input block element X_i, squaring
the result and dividing by sixteen (16).
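
The exhaustive comparison described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and blocks and patterns are assumed to be plain 16-element sequences.

```python
def mean_squared_error(block, pattern):
    """Mean of the 16 squared element differences between block and pattern."""
    return sum((x - p) ** 2 for x, p in zip(block, pattern)) / 16


def best_fit_index(block, codebook):
    """Exhaustive search: index of the pattern with the minimum error."""
    errors = [mean_squared_error(block, pattern) for pattern in codebook]
    return min(range(len(codebook)), key=errors.__getitem__)
```

Every pattern is evaluated in full here, which is the cost that the acceleration techniques of the AVQ 134 are intended to reduce.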

The process of searching for a matching pattern in the 
codebook 214 is time-consuming. The AVQ 134 of the preferred 
embodiment accelerates the pattern matching process with a 
variety of techniques. 

First, in order to find the optimal codebook pattern,
the AVQ 134 compares each input block term X_i to the
corresponding term in the codebook pattern P_1 being tested
and calculates the total squared error for the first codebook
pattern. This value is stored as the initial minimum error.

For each of the other patterns P_j = P_2, P_3, ..., P_M, the AVQ 134
subtracts the X_1 and P_1j terms and squares the result. The
AVQ 134 compares the resulting squared error to the minimum
error. If the squared error value is less than the minimum
error, the AVQ 134 continues with the next input term X_2 and
computes the squared error associated with X_2 and P_2j. The
AVQ 134 adds the result to the squared error of the first two
terms. The AVQ 134 then compares the accumulated squared
error for X_1 and X_2 to the minimum error. If the accumulated
squared error is less than the minimum error, the squared
error calculation continues until the AVQ 134 has evaluated
all 16 terms.

If at any time in the comparison, the accumulated 
squared error for the new pattern is greater than the minimum 
squared error, the current pattern is immediately rejected 
and the AVQ 134 discontinues calculating the squared error 
for the remaining input block terms for that pattern. If the 
total squared error for the new pattern is less than the 
minimum error, the AVQ 134 replaces the minimum error with 
the squared error from the new pattern before making the 
comparisons for the remaining patterns.

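
The early-rejection search described above can be sketched as follows. This is a minimal illustration, assuming 16-element sequences for the input block and the patterns; the function name is hypothetical.

```python
def partial_distance_search(block, codebook):
    """Return (index, squared_error) of the best pattern, rejecting early."""
    # The first pattern is evaluated in full to seed the minimum error.
    best_index = 0
    min_error = sum((x - p) ** 2 for x, p in zip(block, codebook[0]))
    for index in range(1, len(codebook)):
        accumulated = 0
        rejected = False
        for x, p in zip(block, codebook[index]):
            accumulated += (x - p) ** 2
            if accumulated > min_error:
                rejected = True  # cannot beat the current minimum: stop early
                break
        if not rejected and accumulated < min_error:
            best_index, min_error = index, accumulated
    return best_index, min_error
```

A pattern that is far from the input block is usually abandoned after a few terms, so the average cost per pattern falls well below 16 subtractions.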
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uses adaptive tables, 3) a conventional LZ1 coding technique 
or 4) a run-length encoding process. The channel encoder 168 
chooses the optimal compression method based on the image 
type identified in the control script 196. 
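
As an illustration of the fourth option, a run-length encoding process can be sketched as follows. The text does not specify the run format used by the channel encoder 168, so the [count, value] pairing and the function names are assumptions.

```python
def run_length_encode(values):
    """Collapse consecutive repeats into [count, value] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1
        else:
            runs.append([1, value])
    return runs


def run_length_decode(runs):
    """Expand [count, value] pairs back into the original sequence."""
    return [value for count, value in runs for _ in range(count)]
```

Run-length coding pays off on data with long constant runs, such as graphics or text regions, which is consistent with selecting the method by image type.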
The Adaptive Vector Quantizer 

The preferred embodiment of the AVQ 134 is illustrated 
in FIG. 19. More specifically, the AVQ 134 optimizes the 
vector quantization techniques described above. The AVQ 134 
sub-divides the image data into a set of 4 x 4 pixel blocks
216. The 4 x 4 pixel blocks 216 include sixteen (16)
elements X_1, X_2, X_3, ..., X_16 218, that start at the upper left-hand
corner and move left to right on every row to the bottom
right-hand corner.
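
The subdivision into 4 x 4 blocks with raster-ordered elements can be sketched as follows. This is a minimal illustration that assumes the image is a list of pixel rows whose dimensions are multiples of the block size; the function name is hypothetical.

```python
def split_into_blocks(image, size=4):
    """Split a 2-D list (rows of pixels) into flattened size x size blocks."""
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            # Raster order inside the block: left to right, top to bottom.
            blocks.append([image[top + r][left + c]
                           for r in range(size) for c in range(size)])
    return blocks
```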

The codebook 214 of the present invention comprises M
predetermined sixteen-element vectors, P_1, P_2, P_3, ..., P_M 220,
that correspond to common patterns found in the population of
images. The indexes I_1, I_2, I_3, ..., I_M 222 refer respectively to
the patterns P_1, P_2, P_3, ..., P_M 220.

Finding a best-fit pattern from the codebook 214
requires comparing each input block with every pattern in the
codebook 214 and selecting the index that corresponds to the
pattern with the minimum squared error summed over the 16
elements in the 4 x 4 block. The optimal code, C, for an
input vector, X, is the index j such that pattern P_j
satisfies:

\[
\sum_{i=1}^{16} \frac{(X_i - P_{ij})^2}{16}
  = \min_{P_k \in P} \sum_{i=1}^{16} \frac{(X_i - P_{ik})^2}{16}
\]

where: X_i is the ith element of the input vector, X
and P_ik is the ith element of the VQ pattern P_k

The comparison equation finds the best match by 
selecting the minimum error term that results from comparing 
the input block with the codebook patterns. In other words, 
the AVQ 134 calculates the mean squared error term associated 
with each pattern in the codebook 214 in order to determine 
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4x4 residual block by assigning an index that identifies 
the corresponding block pattern in the codebook. Once 
complete, the AVQ 134 outputs the compressed high resolution 
residual to the VQ2 data segment 258. 

FIG. 17 illustrates a block diagram of the palette
selector 164. The palette selector 164 computes a "best-fit"
24-bit color palette 260 for the decoder 110. The palette
selector 164 is optional and is user defined. The palette
selector 164 computes the color palette 260 from the Y_tau2
miniature 190, the U_tau2 miniature 192 and the X_tau2
miniature 194. The user can select the number of palette
entries N 262, from 0 to 255 entries. If the user
selects a zero, no palette is computed. If enabled, the
palette selector 164 adds the color palette 260 to a
plurality of data segments 166.

The channel encoder 168, as shown in FIG. 18,
interleaves and channel encodes the plurality of data
segments 166. Based on the user defined playback model 261,
the plurality of data segments 166 are interleaved as
follows: 1) as a single layer, single-pass comprising the
entire image, 2) as two layers comprising the thumbnail
miniature 120 and the remainder of the image 122 with
enhancement information interleaved into each data block
(panel) in the second layer, and 3) as multiple layers
comprising the thumbnail miniature 120, the standard image
124, the sharp image 105, and additional layers as specified
by the user. For each playback model an option exists to
interleave the data for panellized or non-panellized display.
The user defined playback model 261 is described in more
detail below.

After interleaving the plurality of data segments 166,
the channel encoder 168 compresses the plurality of data
segments 166 in response to the control script 196. In the
preferred embodiment, the channel encoder 168 compresses the
plurality of data segments 166 with: 1) a Huffman encoding
process that uses fixed tables, 2) a Huffman process that
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threshold value E 252 determines how much enhancement 
information the encoder 102 adds to the compressed file 104. 
Thus, setting the threshold value E 252 to zero will suppress 
any image enhancement information. 
If the result of convolving a particular 16 x 16 high
resolution block is greater than the threshold value E 252,
the 16 x 16 high resolution block is prioritized and added to
the enhancement list 250. Thus the enhancement list 250
identifies which 16 x 16 blocks are coded and prioritizes how
the 16 x 16 coded blocks are listed.
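
The thresholding and prioritizing step can be sketched as follows. This is a hedged illustration: the convolution itself is not modelled, the per-block results are assumed to be supplied as a mapping from a block identifier to a score, and ordering by descending score is an assumed priority rule. The function name is hypothetical.

```python
def build_enhancement_list(block_scores, threshold):
    """Return block ids whose score exceeds the threshold, highest first.

    block_scores maps a block id, e.g. (row, col), to the convolution
    result for that 16 x 16 block (how the score is computed is not
    modelled here).
    """
    if threshold == 0:
        return []  # per the text, a zero threshold suppresses enhancement
    selected = [(score, block) for block, score in block_scores.items()
                if score > threshold]
    selected.sort(key=lambda item: -item[0])
    return [block for score, block in selected]
```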

The high resolution residual calculator 162, as shown in
FIG. 16, determines the high resolution residual for each
16 x 16 high resolution block identified in the enhancement
list 250. The high resolution residual calculator 162
translates the VQ1 data segment 224 from the AVQ 134 into a
reconstructed rY_tau2 residual 212 by mapping the indexes in
the VQ1 data segment 224 to the patterns in the codebook.
The reconstructed rY_tau2 residual is added to the dY_tau2
miniature 254 (dequantized DCT components). The result is
interpolated by a factor of two in the vertical and
horizontal dimensions and is subtracted from the original
Y_tau2 miniature 190 to form the high resolution residual.

The high resolution residual calculator 162 then
extracts high resolution 16 x 16 blocks from the high
resolution residual according to the priorities in the
enhancement list 250. As will be explained in more detail
below, the high resolution residual calculator 162 outputs
the highest priority blocks in the first enhancement layer,
the next-highest priority blocks in the second enhancement
layer, etc. The high resolution residual blocks are referred
to as the xr_Y residual 256.

The xr_Y residual 256 is further compressed with the AVQ
134. The AVQ 134 subdivides the xr_Y residual 256 into 4x4
residual blocks. The residual blocks are compared with
blocks in the codebook. If a residual block corresponds to
a block pattern in the codebook, the AVQ 134 compresses the




PCT/US95/08827 

WO 96/02895 






referred to as a dU_tau4 miniature 234. The interpolated
X_tau16 miniature 232 is referred to as a dX_tau4 miniature
236. The dU_tau4 miniature 234 and dX_tau4 miniature 236 are
subtracted from the actual U_tau4 miniature 226 and X_tau4
miniature 228 to create an rU_tau4 residual 238 and an
rX_tau4 residual 240.
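
The residual computation can be sketched as follows. This is a minimal illustration, not the Reed Spline interpolation itself: pixel replication is used as a stand-in interpolator, and the function names are hypothetical.

```python
def upsample(miniature, factor):
    """Enlarge a 2-D list by pixel replication (stand-in interpolator)."""
    return [[value for value in row for _ in range(factor)]
            for row in miniature for _ in range(factor)]


def residual(actual, coarse, factor=4):
    """Element-wise difference between the actual and the interpolated data."""
    predicted = upsample(coarse, factor)
    return [[a - p for a, p in zip(actual_row, predicted_row)]
            for actual_row, predicted_row in zip(actual, predicted)]
```

Adding the residual back to the interpolated data recovers the actual miniature, which is why the residual preserves the information lost by decimation.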

As illustrated in FIG. 11, the rU_tau4 residual 238 and 
the rX_tau4 residual 240 are further compressed with the AVQ 
134. The AVQ 134 subdivides the rU_tau4 residual 238 and the 
rX_tau4 residual 240 into 4 x 4 residual blocks. The 
residual blocks are compared with blocks in the set of 
codebooks 214 to find the codebook patterns that minimize the 
squared error. The AVQ 134 compresses the residual block by 
assigning an index that identifies the corresponding block 
pattern in the set of codebooks 214. Once complete, the AVQ
134 outputs the compressed residual as the VQ3 data segment 
242 and the VQ4 data segment 244. 

The U_tau16 miniature 230 and the X_tau16 miniature 232
are also compressed with the DPCM 140 as shown in FIG. 14.
The DPCM 140 outputs the low-detail color components as the
URCA data segment 246 and the XRCA data segment 248. The
URCA data segment 246 and the XRCA data segment 248 form the
low-detail color components that the decoder 110 uses to
create the color thumbnail miniature 120 if this is included
as a playback option in the compressed data stream 118.
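
The principle of differential pulse code modulation can be sketched as follows. The predictor of the DPCM 140 is not specified in this passage, so the sketch assumes the simplest case: each sample is predicted by the previous sample, the first prediction is zero, and no quantizer is applied. The function names are hypothetical.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous sample."""
    previous = 0
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def dpcm_decode(differences):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for difference in differences:
        previous += difference
        samples.append(previous)
    return samples
```

For smooth, low-detail data the differences cluster near zero, which makes them cheaper to entropy code than the raw samples.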

FIG. 15 illustrates the enhancement analyzer 144 of the
preferred embodiment. The Y_tau2 miniature 190, the U_tau4
miniature 226, and the X_tau4 miniature 228 are analyzed to
determine an enhancement list 250 that specifies the visual
priority of every 16 x 16 image block. The enhancement
analyzer 144 determines the visual priority of each 16 x 16
image block by convolving the Y_tau2 miniature 190, the
U_tau4 miniature 226, and the X_tau4 miniature 228 and
comparing the result of the convolution to a threshold value
E 252. The threshold value E 252 is user defined. The user
can set the threshold value E 252 from zero to 200. The

indexes to the VQ1 data segment 224. Thus, the VQ1 data
segment 224 is a list of codebook indexes that identify block
patterns in the codebook. As explained in more detail below,
the AVQ 134 of the preferred embodiment also generates new
codebook patterns that the AVQ 134 outputs to the set of
codebooks 214. The added codebook patterns are stored in the
VQCB data segment 223.

FIG. 12 illustrates a block diagram of the second Reed
Spline Filter 225 and third Reed Spline Filter 227. Once the
image classifier 152 determines the particular image type,
the U_tau2 miniature 192 and the X_tau2 miniature 194 are
further decimated and filtered by the second Reed Spline
Filter 225. Like the first Reed Spline Filter 148 shown in
FIG. 6, the second Reed Spline Filter 225 compresses the
U_tau2 miniature 192 and the X_tau2 miniature 194 in a
two-step process. First, the U_tau2 miniature 192 and the X_tau2
miniature 194 are vertically and horizontally decimated by a
factor of two. The decimated data are then spline fitted to
determine optimal reconstruction weights that will minimize
the mean square error of the reconstructed decimated
miniatures. Once complete, the second Reed Spline Filter 225
outputs the optimal reconstruction values to create a U_tau4
miniature 226 and an X_tau4 miniature 228.

The third Reed Spline Filter 227 decimates the U_tau4
miniature 226 and the X_tau4 miniature 228 vertically and
horizontally by a factor of four. The decimated image data
are again spline fitted to create a U_tau16 miniature 230 and
an X_tau16 miniature 232.

In FIG. 13 the Reed Spline residual calculator 158
preserves the image information lost by the second Reed
Spline Filter 225 and the third Reed Spline Filter 227 by
computing and compressing the Reed Spline Filter residual.
The Reed Spline residual calculator 158 reconstructs the
U_tau4 miniature 226 and X_tau4 miniature 228 by
interpolating the U_tau16 miniature 230 and the X_tau16
miniature 232. The interpolated U_tau16 miniature 230 is
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residual calculator 154 then reconstructs the dequantized DCT 
components with an inverse DCT 210 to generate a 
reconstructed dY_tau2 miniature 211. The reconstructed
dY_tau2 miniature 211 is subtracted from the original Y_tau2 
miniature 190 to create an rY_tau2 residual 212. 
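
The dequantize, inverse-transform and subtract sequence can be sketched as follows. For brevity this illustration works on a one-dimensional signal with an orthonormal inverse DCT; the two-dimensional 8 x 8 case applies the same steps along rows and columns. The function names and the choice of normalization are assumptions.

```python
import math


def idct_1d(coefficients):
    """Inverse of the orthonormal DCT-II (i.e. a DCT-III)."""
    n = len(coefficients)
    out = []
    for i in range(n):
        total = 0.0
        for k, value in enumerate(coefficients):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * value * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(total)
    return out


def dct_residual(original, quantized, reconstruction_values):
    """Dequantize, inverse transform, and subtract from the original samples."""
    dequantized = [q * r for q, r in zip(quantized, reconstruction_values)]
    reconstructed = idct_1d(dequantized)
    return [o - d for o, d in zip(original, reconstructed)]
```

The residual is zero wherever the quantized DCT reproduces the original exactly and carries only the detail that quantization discarded.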

Referring to FIG. 11, it can be seen that the rY_tau2 
residual 212 is further compressed with the AVQ 134. The 
technique of vector quantization is used to represent a block 
of information as a single index that requires fewer bits of 
storage. As explained in more detail below, the AVQ 134 
maintains a group of commonly occurring block patterns in a 
set of codebooks 214 stored in the resource file 160. The 
index references a particular block pattern within a 
particular codebook 214. The AVQ 134 compares the input 
block with the block patterns in the set of codebooks 214.

If a block pattern in the set of codebooks 214 matches or 
closely approximates the input block, the AVQ 134 replaces 
the input block pattern with the index. 

Thus, the AVQ 134 compresses the input block information
into a list of indexes. The indexes are decompressed by
replacing each index with the block pattern each index
references in the set of codebooks 214. The decoder 110, as
explained in more detail below, also has a set of the
codebooks 214. During the decoding process the decoder 110
uses the list of indexes to reference block patterns stored
in a particular codebook 214. The original source cannot be
precisely recovered from the compressed representation since
the indexed patterns in the codebook will not match the input
block exactly. The degree of loss will depend on how well
the codebook matches the input block.
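
The decompression of an index list can be sketched as follows. This is a minimal illustration that assumes a single codebook held as a list of 16-element patterns; the function name is hypothetical.

```python
def vq_decode(indexes, codebook):
    """Replace every index by a copy of the pattern it references."""
    return [list(codebook[index]) for index in indexes]
```

For example, if an input block of sixteen 8s was encoded as the index of a pattern of sixteen 9s, decoding returns the 9s: the difference is the loss referred to above.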

As shown in FIG. 11, the AVQ 134 compresses the rY_tau2
residual 212 by sub-dividing the rY_tau2 residual 212 into
4x4 residual blocks and comparing the residual blocks with
codebook patterns as explained above. The AVQ 134 replaces
the residual blocks with the codebook indexes that minimize
the squared error. The AVQ 134 outputs the list of codebook
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152 sub-divides the source image 100 into distinct regions. 
The image classifier 152 then outputs the control script 196 
that specifies the correct compression methods for each 
region. The control script 196 specifies which compression
methods to apply in the third stage 130, and specifies the
channel encoding methods to apply in the fourth stage 132.

As shown in FIG. 4, during the third stage 130, the
encoder 102 uses the control script 196 to select the optimal
compression methods from its compression toolbox. The
encoder 102 separates the Y data 186 from the U and X data
188. Thus, the encoder 102 separates the Y_tau2 miniature
190 from the U_tau2 miniature 192 and the X_tau2 miniature
194, and passes the Y_tau2 miniature 190 to the optimized DCT
136, and passes the U_tau2 miniature 192 and the X_tau2
miniature 194 to a second and third Reed Spline Filter 156.

As illustrated in FIG. 9, the optimized DCT 136
subdivides the Y_tau2 miniature 190 into a set of 8 x 8 pixel
blocks and transforms each 8 x 8 pixel block into sixty-four
DCT coefficients 198. The DCT coefficients include the AC
terms 200 and the DC terms 201. The DCT coefficients 198 are
analyzed by the optimized DCT 136 to determine optimal
quantization step sizes and reconstruction values. The
optimized DCT 136 stores the optimal quantization step sizes
(uniform or non-uniform) in a quantization table Q 202 and
outputs the reconstruction values to the CS data segment 204.
The optimized DCT 136 then quantizes the DCT coefficients 198
according to the quantization table Q 202. Once quantized,
the optimized DCT 136 outputs the DCT quantized values 206 to
the DCT data segment 208.
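
The transform and quantization of one 8 x 8 block can be sketched as follows. This is a generic illustration, not the optimized DCT 136: it uses an orthonormal DCT-II and uniform rounding, and it takes the step sizes as given rather than deriving optimal ones. The function names are hypothetical.

```python
import math


def dct_1d(vector):
    """Orthonormal DCT-II of an N-element sequence."""
    n = len(vector)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(
            value * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, value in enumerate(vector)))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    columns = [dct_1d(list(column)) for column in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def quantize(coefficients, step_sizes):
    """Divide each coefficient by its step size and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, step_sizes)]
```

For a flat block, all of the energy lands in the single DC term and every AC term quantizes to zero, which is what makes the quantized block cheap to channel encode.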

In order to preserve the image information lost by the
optimized DCT 136, the DCT residual calculator 154 (shown in
FIG. 10) computes and compresses the DCT residual. The DCT
residual calculator 154 dequantizes in a dequantizer 209 the
DCT quantized values 206 stored in the DCT data segment 208
by multiplying the reconstruction values in the CS data
segment 204 with the DCT quantized values 206. The DCT
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two components are related to the chrominance U and X data 
188. The color space converter 150 transforms the RGB to the 
YUX color space according to the following formulas: 
Y = 0.29900R + 0.58700G + 0.11400B
U = 0.16870R + 0.33120G + 0.50000B
X = 0.50000R - 1.08216G + 0.918S9B
Referring to FIG. 6, it can be seen that an R_tau2
miniature 180 corresponds to a red miniature that is decimated
and spline fitted by a factor of 2. A G_tau2 miniature 182
corresponds to a green miniature that is decimated and spline
fitted by a factor of 2. A B_tau2 miniature 184 corresponds
to a blue miniature that is decimated and spline fitted by a
factor of 2.

FIG. 7 illustrates the color space converter 150 of FIG.
4. The color space converter 150 transforms the R_tau2
miniature 180, the G_tau2 miniature 182 and the B_tau2
miniature 184 output by the first Reed Spline Filter 148 into
a different color coordinate system in which one component is
the luminance Y data 186 and the other two components are
related to the chrominance U and X data 188 as shown in FIG.
4. Thus the color space converter 150 transforms the R_tau2
miniature 180, the G_tau2 miniature 182 and the B_tau2
miniature 184 into a Y_tau2 miniature 190, a U_tau2 miniature
192 and an X_tau2 miniature 194.
Referring to FIG. 8, it can be seen that the second
stage 128 of the encoder 102 includes an image classifier 152
that determines the image type by analyzing the Y_tau2
miniature 190, the U_tau2 miniature 192 and the X_tau2
miniature 194. The image classifier 152 uses a fuzzy logic
rule base to classify an image into one or more of its known
classes. In the preferred embodiment, these classes include
gray scale, graphics, text, photographs, high activity and
low activity images. The image classifier 152 also
decomposes the source image 100 into block units and
classifies each block. Since the source image 100 includes
a combination of different image types, the image classifier
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"tau." The R_tau2' decimated data 174 corresponds to the red 
component decimated by a factor of 2. The G_tau2' decimated 
data 176 corresponds to the green component decimated by a 
factor of 2. The B_tau2' decimated data 178 corresponds to 
the blue component decimated by a factor of 2.

In the spline fitting step in block 172, the first Reed
Spline Filter 148 partially restores the source image detail
lost by the decimation in block 170. The spline fitting step
in block 172 processes the R_tau2' decimated data 174, the
G_tau2' decimated data 176, and the B_tau2' decimated data 178 to
calculate optimal reconstruction weights.

As explained in more detail below, the decoder 110 will 
interpolate the decimated data into a full sized image. In 
this interpolation, the decoder 110 uses the reconstruction 
weights which have been calculated by the Reed Spline Filter 
in such a way as to minimize the mean squared error between 
the original image components and the interpolated image 
components. Accordingly the Reed Spline Filter 148 causes
the interpolated image to match the original image more
closely and increases the overall sharpness of the
interpolated picture. In addition, reducing the error
arising from the decimation step in block 170 reduces the
amount of data needed to represent the residual image. The
residual image is the difference between the reconstructed 
image and the original image. 

The reconstruction weights output from the Reed Spline 
Filter 148 form a "miniature" of the original source image
100 for each primary color of red, green, and blue, wherein 
each red, green, and blue miniature is one-quarter the 
resolution of the original source image 100 when a tau of 2 
is used. 
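
The relationship between the decimation factor tau and the size of a miniature can be checked with a short calculation. This sketch assumes that decimation by tau along each axis divides each dimension by tau (rounding down); the function names are hypothetical.

```python
def miniature_size(width, height, tau):
    """Dimensions after decimating by tau along each axis."""
    return width // tau, height // tau


def sample_fraction(tau):
    """Fraction of the original samples that remain after decimation."""
    return 1 / (tau * tau)
```

With a tau of 2, a 640 x 480 component becomes 320 x 240, which is one-quarter of the original number of samples.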

More specifically, the preferred color space converter 
150 transforms the R_tau2 miniature 180, the G_tau2 miniature 
182 and the B_tau2 miniature 184 output by the first Reed 
Spline Filter 148 into a different color coordinate system in 
which one component is the luminance Y data 186 and the other
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at least one of the following: an adaptive vector quantizer 
(AVQ 134), an optimized discrete cosine transform (optimized 
DCT 136), a Reed Spline Filter 138 (RSF), a differential
pulse code modulator (DPCM 140), a run length encoder (RLE
142), and an enhancement analyzer 144. 

FIG. 4 illustrates a more detailed block diagram of the 
encoder 102. The first stage 126 of the encoder 102 includes 
a formatter 146, a first Reed Spline Filter 148 and a color 
space converter 150 which produces Y data 186, and U and X
data 188. The second stage 128 includes an image classifier
152. The third stage includes an optimized discrete cosine
transform and adaptive DCT quantization (optimized DCT 136),
a DCT residual calculator 154, the adaptive vector quantizer 
(AVQ 134), a second and a third Reed Spline Filter 156, a 
Reed Spline residual calculator 158, the differential pulse 
code modulator (DPCM 140), a resource file 160, the 
enhancement analyzer 144, a high resolution residual 
calculator 162, and a palette selector 164. The fourth stage 
includes a plurality of data segments 166 and a channel 
encoder 168. The output of the channel encoder 168 is stored 
in the compressed file 104.

The formatter 146, as shown in more detail in FIG. 5, 
converts the source image 100 from its native format to a 24- 
bit red, green and blue pixel array. For example, if the 
source image 100 is an 8-bit palettized image, the formatter
converts the 8-bit palettized image to a 24-bit red, green,
and blue equivalent. 
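
The conversion of a palettized image to its 24-bit equivalent can be sketched as follows. This is a minimal illustration that assumes the image is a list of rows of palette indexes and the palette is a list of (red, green, blue) triples; the function name is hypothetical.

```python
def expand_palettized(indexes, palette):
    """Map each 8-bit palette index to its (red, green, blue) triple."""
    return [[palette[index] for index in row] for row in indexes]
```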

The first Reed Spline Filter 148, illustrated in more 
detail in FIG. 6, uses a two-step process to compress the 
formatted source image 100. The two-step process comprises 
a decimation step performed in block 170 and a spline fitting 
step performed in a block 172. As explained in more detail 
below, the decimation step in the block 170 decimates each 
color component of red, green, and blue by a factor of two 
along the vertical and horizontal dimensions using a Reed 
Spline decimation kernel. The decimation factor is called

Referring to FIG. 2, it can be seen that the layering of
the compressed file 104 allows the decoder 110 to display a
thumbnail image and progressively improving quality versions
of the source image 100 before the decoder 110 receives the
entire compressed file 104. The first data expanded by the
decoder 110 can be viewed as a thumbnail miniature 120 of the
original image or as a coarse quality "splash" image 122 with
the same dimensions as the original image. The splash image
122 is a result of interpolating the thumbnail miniature to
the dimensions of the original image. As the decoder 110
continues to receive data from the data stream 118, the
decoder 110 creates a standard image 124 by decoding the
second layer of information and adding it to the splash image
122 data to create a higher quality image. The encoder 102
can create a user-specified number of layers in which each
layer is decoded and added to the displayed image as data is
received. Upon receiving the entire compressed file 104 via
the data stream 118, the decoder 110 displays an enhanced
image 105 that is the highest quality reconstructed image
that can be obtained from the compressed data stream 118.
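
The progressive display described above can be sketched as follows. This is a simplified illustration that assumes each later layer decodes to an additive correction with the same dimensions as the splash image; the function name and the data layout are hypothetical.

```python
def progressive_decode(splash, layers):
    """Yield the displayed image after each additional layer is added."""
    image = [row[:] for row in splash]
    yield [row[:] for row in image]
    for layer in layers:
        image = [[p + d for p, d in zip(image_row, layer_row)]
                 for image_row, layer_row in zip(image, layer)]
        yield [row[:] for row in image]
```

Each yielded image corresponds to one display update, so a viewer sees the coarse image first and the refinements as they arrive.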

FIG. 3 illustrates a block diagram of the encoder 102
constructed in accordance with the present invention. The
encoder 102 compresses the source image 100 in four main
stages. In a first stage 126, the source image 100 is
formatted, processed by a Reed Spline Filter and color
converted. In a second stage 128, the encoder 102 classifies
the source image 100 in blocks. In a third stage 130, the
encoder 102 selectively applies particular encoding methods
that optimize the compression ratio. Finally, the compressed
data are interleaved and channel encoded in a fourth stage
132.

The encoder 102 contains a library of encoding methods
that are treated as a toolbox. The toolbox allows the
encoder 102 to selectively apply particular encoding methods
that optimize the compression ratio for a particular image
type. In the preferred embodiment, the encoder 102 includes

As discussed in more detail below, the encoder 102 uses
decimation, filtering, mathematical transforms, and
quantization techniques to concentrate the image into fewer
data samples representing the image with fewer bits per pixel
than the original format. Once the source image 100 is
compressed with the encoder 102, the set of compressed data
is assembled in the compressed file 104. The compressed
file 104 is stored in the first storage device 106 or
transmitted to another location via the data channel 108. If
the compressed file 104 is transmitted to another location,
the data stored in the compressed file 104 is transmitted
sequentially via the data channel 108. The sequence of bits
in the compressed file 104 that are transmitted via the data
channel 108 is referred to as a data stream 118.

The decoder 110 expands the compressed file 104 to the 
original source image size. During the process of decoding 
the compressed file 104, the decoder 110 displays the 
expanded source image 100 on the display 112. In addition, 
the decoder 110 may store the expanded compressed file 104 in 
the second storage device 114 or print the expanded 
compressed file 104 on the printer 116. 

For example, if the source image 100 comprises a
640 x 480, 24-bit color image, the amount of memory needed to
store and display the source image 100 is approximately
922,000 bytes. In the preferred embodiment, the encoder 102
computes the highest compression ratio for a given decoding
quality and playback model. The playback model allows a user
to select the decoding mode as is discussed in more detail
below. The compressed data are then assembled in the
compressed file 104 for transmittal via the data channel 108
or stored in the first storage device 106. For example, at
a 92-to-1 compression ratio, the 922,000 bytes that represent
the source image 100 are compressed into approximately 10,000
bytes. In addition, the encoder 102 arranges the compressed
data into layers in the compressed file 104.
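
The storage figures in this example can be verified with a short calculation. The function names are hypothetical.

```python
def uncompressed_bytes(width, height, bits_per_pixel):
    """Storage for a raw image: pixel count times bits per pixel, in bytes."""
    return width * height * bits_per_pixel // 8


def compressed_bytes(raw_bytes, ratio):
    """Approximate size after compression at the given ratio."""
    return round(raw_bytes / ratio)
```

A 640 x 480 image at 24 bits per pixel occupies 921,600 bytes, and dividing by 92 gives about 10,017 bytes, in agreement with the approximate figures of 922,000 and 10,000 bytes.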
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FIG. 51 is plots of standard error for representative 
images 1 and 2; 

FIG. 52 is a compressed two-miniature using the
optimized decomposition weights; 

FIG. 53 is a block diagram of a preferred adaptive 
compression scheme in which the method of the present 
invention is particularly suited; 

FIG. 54 is a block diagram showing a combined sublevel 
and optimal-spline compression arrangement; 

FIG. 55 is a block diagram showing a combined sublevel 
and optimal-spline reconstruction arrangement;

FIG. 56 is a block diagram showing a multi-resolution 
optimized interpolation arrangement; and 

FIG. 57 is a block diagram showing an embodiment of the 
optimizing process in the image domain. 

Detailed Description of the Invention 
FIG. 1 illustrates a block diagram of an image 
compression system that includes a source image 100, an 
encoder 102, a compressed file 104, a first storage device 
106, a communication data channel 108, a decoder 110, a
display 112, a second storage device 114, and a printer 116. 
The source image 100 is represented as a two-dimensional 
image array of picture elements, or pixels. The number of 
pixels determines the resolution of the source image 100, 
which is typically measured by the number of horizontal and 
vertical pixels contained in the two-dimensional image array. 

Each pixel is assigned a number of bits that represent 
the intensity level of the three primary colors: red, green, 
and blue. In the preferred embodiment, the full-color source 
image 100 is represented with 24 bits, where 8 bits are 
assigned to each primary color. Thus, the total storage 
required for an uncompressed image is computed as the number 
of pixels in the image times the number of bits used to 
represent each pixel (referred to as bits per pixel).
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FIG. 33 illustrates a table of several examples showing 
the mapping from input measurements to input sets to output 
sets; 

FIG. 34 is a block diagram of image data compression;

FIG. 35 is a block diagram of a spline
decimation/interpolation filter;

FIG. 36 is a block diagram of an optimal spline filter;

FIG. 37 is a vector representation of the image,
processed image, and residual image;

FIG. 38 is a block diagram showing a basic optimization
block of the present invention;

FIG. 39 is a graphical illustration of a one-dimensional
bi-linear spline projection;

FIG. 40 is a schematic view showing periodic replication
of a two-dimensional image;

FIGs. 41a, 41b and 41c are perspective and plan views of
a two-dimensional planar spline basis;

FIG. 42 is a diagram showing representations of the
hexagonal tent function;

FIG. 43 is a flow diagram of compression and
reconstruction of image data;

FIG. 44 is a graphical representation of a normalized
frequency response of a one-dimensional bi-linear spline
basis;

FIG. 45 is a graphical representation of a
one-dimensional eigenfilter frequency response;

FIG. 46 is a perspective view of a two-dimensional
eigenfilter frequency response;

FIG. 47 is a plot of standard error as a function of
frequency for a one-dimensional cosinusoidal image;

FIG. 48 is a plot of original and reconstructed
one-dimensional images and a plot of standard error;

FIG. 49 is a first two-dimensional image reconstruction 
for different compression factors; 

FIG. 50 is a second two-dimensional image reconstruction 
for different compression factors;
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FIG. 19 is a block diagram of the vector quantization 
process; 

FIGs. 20a and 20b show the segmented architecture of the 
data stream; 

FIG. 21 illustrates the normal segment;

FIGs. 22a, 22b, 22c and 22d illustrate the layering and
interleaving of the plurality of data segments;

FIG. 23 is a block diagram of the decoder of the present 
invention; 

FIG. 24 illustrates the multi-step decoding process and
includes a Ym miniature, a Um miniature, an Xm miniature, the
thumbnail miniature, the splash image, the standard image,
and the enhanced image;

FIG. 25 is a block diagram of the decoder and includes
an inverse Huffman encoder, an inverse DPCM, a dequantizer,
a combiner, an inverse DCT, a demultiplexer, and an adder;

FIG. 26 is a block diagram of the decoder and includes
the interpolator, interpolation factors, a scaler, scale
factors, a replicator, and an inverse color converter;

FIG. 27 is a block diagram of the decoder that includes
the inverse Huffman encoder, the combiner, the dequantizer,
the inverse DCT, a pattern matcher, the adder, the
interpolator, and an enhancement overlay builder;

FIG. 28 is a block diagram of the scaler with an input to
output ratio of five-to-three in the one-dimensional case;

FIG. 29 illustrates the process of bilinear 
interpolation; 

FIG. 30 is a block diagram of the process of optimizing
the compression methods with the image classifier, the
enhancement analyzer, the optimized DCT, the AVQ, and the
channel encoder;

FIG. 31 is a block diagram of the image classifier; 
FIG. 32 is a flow chart of the process of creating an 
adaptive uniform DCT quantization table;

image, a splash image, a panellized standard image, and the
final representation of the source image; 

FIG. 3 is a block diagram of the encoder showing the 
four stages of the encoding process; 
FIG. 4 is a block diagram of the encoder showing a first
Reed Spline Filter, a color space conversion transform, a Y
miniature, a U miniature, an X miniature, an image
classifier, an optimized discrete cosine transform, a
discrete cosine transform residual calculator, an adaptive
vector quantizer, a second and third Reed Spline Filter, a
Reed Spline residual calculator, a differential pulse code
modulator, an enhancement analyzer, a high resolution
residual calculator, a palette selector, a plurality of data
segments and a channel encoder;

FIG. 5 is a block diagram of the image formatter;

FIG. 6 is a block diagram of the Reed Spline Filter; 
FIG. 7 is a block diagram of the color space conversion 
transform; 

FIG. 8 is a block diagram of the image classifier; 
FIG. 9 is a block diagram of the optimized discrete
cosine transform;

FIG. 10 is a block diagram of the DCT residual 
calculator; 

FIG. 11 is a block diagram of the adaptive vector
quantizer;

FIG. 12 is a block diagram of the second and third Reed 
Spline Filters; 

FIG. 13 is a block diagram of the Reed Spline residual
calculator;

FIG. 14 is a block diagram of the differential pulse
code modulator;

FIG. 15 is a block diagram of the enhancement analyzer; 
FIG. 16 is a block diagram of the high resolution 
residual calculator; 
FIG. 17 is the block diagram of the palette selector;

FIG. 18 is the block diagram of the channel encoder;

and further compressed with an optimized discrete cosine
transform and an adaptive vector quantizer. The second and
third components are further compressed with a second and
third Reed Spline Filter, the adaptive vector quantizer, and
a differential pulse code modulator.

The enhancement analyzer enhances areas of an image 
determined to be the most visually important, such as text or 
edges. The enhancement analyzer determines the visual 
priority of pixel blocks. The pixel block dimensions
typically correspond to 16 x 16 pixel blocks in the source
image. In addition, the enhancement analyzer prioritizes
each pixel block so that the most important enhancement
information is placed in the earliest enhancement layers so
that it can be decoded first. The output of the enhancement
analyzer is compressed with the adaptive vector quantizer.
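
The block prioritization described above can be illustrated with a short Python sketch. The passage does not state how visual priority is measured, so pixel variance within each block is used here purely as an assumed stand-in, and the function name block_priority is hypothetical. A small block size is used in the example so the result is easy to follow; the text's typical size of 16 is the default.

```python
def block_priority(image, block=16):
    """Return block origins (row, col), most 'important' first.

    Importance is approximated by pixel variance, which is an
    assumption made for illustration only.
    """
    height, width = len(image), len(image[0])
    scored = []
    for r in range(0, height, block):
        for c in range(0, width, block):
            vals = [image[y][x]
                    for y in range(r, min(r + block, height))
                    for x in range(c, min(c + block, width))]
            mean = sum(vals) / len(vals)
            variance = sum((v - mean) ** 2 for v in vals) / len(vals)
            scored.append((variance, (r, c)))
    # Highest variance first; ties keep raster order because sort is stable.
    scored.sort(key=lambda item: -item[0])
    return [origin for _, origin in scored]

# Hypothetical 4 x 4 image split into 2 x 2 blocks.
image = [[5, 5, 0, 9],
         [5, 5, 9, 0],
         [1, 2, 7, 7],
         [2, 1, 7, 7]]
order = block_priority(image, block=2)
print(order)  # the busy top-right block is listed first
```

Placing the blocks in this order into successive enhancement layers would let a decoder refine the busiest regions before the flat ones.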

A user may set the encoder to compute a color palette 
optimized to the color image. The color palette is combined 
with the output of the discrete cosine transform, the 
adaptive vector quantizer, the differential pulse code 
modulator, and the enhancement analyzer to create a plurality
of data segments. The channel encoder then interleaves and
compresses the plurality of data segments.

Brief Description of the Drawings

These and other aspects, advantages, and novel features
of the invention will become apparent upon reading the
following detailed description and upon reference to
accompanying drawings in which: 

FIG. 1 is a block diagram of an image compression system 
that encodes, transfers and decodes an image and includes a 
source image, an encoder, a compressed file, a first storage
device, a data channel, a data stream, a decoder, a display,
a second storage device, and a printer;

FIG. 2 illustrates the multi-step decoding process and
includes the source image, the encoder, the compressed file,
the data channel, the data stream, the decoder, a thumbnail

system to adapt to new image types and to combine compressed
image data with sound, text and video. 

Like the encoder, the decoder of the present invention 
includes a toolbox of decoding methods. The decoding process 
can begin with the decoder first determining the encoding 
methods used to encode each data segment. The decoder
determines the encoding methods from instructions the encoder
inserts into the compressed data file.

Adding decoder instructions to the compressed image data
provides several advantages. A decoder that recognizes the
instructions can decode files from a variety of different
encoders, accommodate content-sensitive encoding methods, and
adjust to user specific needs. The decoder of the present
invention also skips parts of the data stream that contain
data that are unnecessary for a given rendition of the image,
or ignores parts of the data stream that are in an unknown
format. The ability to ignore unknown formats allows future
file layers to be added while maintaining compatibility with 
older decoders. 
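
One way a decoder can skip unknown data is sketched below in Python. The layout shown (a one-byte segment type followed by a four-byte big-endian length and the payload) and the type codes are assumptions made for illustration; they are not the file format of the present invention.

```python
import struct

# Hypothetical segment type codes understood by this decoder.
KNOWN_TYPES = {1: "miniature", 2: "enhancement"}

def pack_segment(seg_type, payload):
    """Prefix a payload with its type (1 byte) and length (4 bytes)."""
    return struct.pack(">BI", seg_type, len(payload)) + payload

def read_segments(stream, wanted=KNOWN_TYPES):
    """Yield (name, payload) for recognized segments and skip the rest."""
    pos = 0
    while pos + 5 <= len(stream):
        seg_type, length = struct.unpack_from(">BI", stream, pos)
        pos += 5
        if seg_type in wanted:
            yield wanted[seg_type], stream[pos:pos + length]
        # The length field allows a payload to be skipped without parsing it.
        pos += length

data = (pack_segment(1, b"core")
        + pack_segment(99, b"future")   # a type this decoder does not know
        + pack_segment(2, b"detail"))
print(list(read_segments(data)))
# [('miniature', b'core'), ('enhancement', b'detail')]
```

Because every segment carries its own length, a segment of type 99 added by a later encoder is stepped over rather than treated as an error. Passing a smaller table as wanted skips segments that are known but unnecessary for a given rendition.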

In a preferred embodiment of the present invention, the 
encoder compresses an image using a first Reed Spline Filter, 
an image classifier, a discrete cosine transform, a second 
and third Reed Spline Filter, a differential pulse code 
modulator, an enhancement analyzer, and an adaptive vector 
quantizer to generate a plurality of data segments that 
contain the compressed image. The plurality of data segments 
are further compressed with a channel encoder. 

The Reed Spline Filter includes a color space conversion 
transform, a decimation step and a least mean squared error 
(LMSE) spline fitting step. The output of the first Reed 
Spline Filter is then analyzed to determine an image type for 
optimal compression. The first Reed Spline Filter outputs 
three components which are analyzed by the image classifier. 
The image classifier uses fuzzy logic techniques to classify 
the image type. Once the image type is determined, the first 
component is separated from the second and third components

supplemented, not discarded, so that the image is built layer
by layer. Thus a single compressed file with a layered file
format can store both a thumbnail and a full size version of
the image and can store the full size version at various
quality levels without storing any redundant information.

The layered approach of the present invention allows the 
transmission or decoding of only the part of the compressed 
file which is necessary to display a desired image quality. 
Thus, a single compressed file can generate a thumbnail and
different quality full size images without the need to
recompress the file to a smaller size and lesser quality, or
store multiple files compressed to different file sizes and
quality levels.
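
The layering idea can be illustrated with a minimal Python sketch on a one-dimensional row of pixels. As simplifying assumptions, pair averaging stands in for the decimation step, sample replication stands in for interpolation, the residual is left uncompressed, and the row is assumed to have an even length; all function names are hypothetical.

```python
def decimate(signal):
    """Halve the sample count by averaging adjacent pairs."""
    return [(signal[i] + signal[i + 1]) // 2
            for i in range(0, len(signal) - 1, 2)]

def expand(base):
    """Return to full length by repeating each sample twice."""
    return [value for value in base for _ in range(2)]

def encode_layers(signal):
    """Split a signal into a coarse base layer and a residual layer."""
    base = decimate(signal)
    residual = [s - e for s, e in zip(signal, expand(base))]
    return base, residual

def decode(base, residual=None):
    """Decode the base alone (coarse) or base plus residual (full)."""
    coarse = expand(base)
    if residual is None:
        return coarse
    return [c + r for c, r in zip(coarse, residual)]

pixels = [10, 12, 20, 24, 30, 30, 8, 4]
base, residual = encode_layers(pixels)
print(base)                    # [11, 22, 30, 6]
print(decode(base))            # [11, 11, 22, 22, 30, 30, 6, 6]
print(decode(base, residual))  # [10, 12, 20, 24, 30, 30, 8, 4]
```

The base layer alone yields a miniature or, once expanded, a coarse full size row. The residual layer only supplements that result, so nothing stored in the first layer is repeated in the second.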

This feature is particularly advantageous for on-line
service applications, such as shopping or other applications
where the user or the application developer may want several
thumbnail images downloaded and presented before the user
chooses to receive the entire full size, high quality image.
In addition to conserving the time and transmission costs
associated with viewing a variety of high quality images that
may not be of interest, the user need only subsequently
download the remainder of each image file to view the higher
detail versions of the image.

The layered format also allows the storage of different
layers of the compressed data file separate from one another.
Thus, the core image data (miniature) can be stored locally
(e.g., in fast RAM for fast access), and the higher
quality "enhancement" layers can be stored remotely in lower
cost bulk storage.

A further feature of the layered file format of the
present invention allows the addition of other compressed
data information. The layered and segmented file format is
extendable so that new layers of compressed information such
as sound, text and video can be added to the compressed image
data file. The extendable file format allows the compression

compression ratio for a particular image component. The
toolbox approach allows the encoder to support many different 
encoding methods in one program, and accommodates the 
invention of new encoding methods without invalidating 
existing decoders. The toolbox approach thus allows 
upgradeability for future improvements in compression methods 
and adaptation to new technologies. 

A further feature of the present invention is that the 
encoder creates a file format that segments or "layers" the 
compressed image. The layering of the compressed image 
allows the decoder to display image file segments, beginning 
with the data at the front of the file, in a coherent 
sequence which begins with the decoding and display of the 
information that constitutes the core of the image as defined 
by human perception. This core information can appear as a
good quality miniature of the image and/or as a full sized 
"splash" or coarse quality version of the image. Both the 
miniature and splash image enable the user to view the 
essence of an image from a relatively small amount of encoded 
data. In applications where the image file is being 
transmitted over a data channel, such as a telephone line or 
limited bandwidth wireless channel, display of the miniature 
and/or splash image occurs as soon as the first segment or 
layer of the file is received. This allows users to view the 
image quickly and to see detail being added to the image as
subsequent layers are received, decoded, and added to the
core image.

The decoder decompresses the miniature and the full 
sized splash quality image from the same information. User 
specified preferences and the application determine whether
the miniature and/or the full sized splash quality image are
displayed for any given image. 

Whether the first layer is displayed as a miniature or 
a splash quality full size image, the receipt of each 
successive layer allows the decoder to add additional image
detail and sharpness. Information from the previous layer is

The exact improvement over JPEG will depend on image content,
resolution, and other factors. 

Smaller image files translate into direct storage and 
transmission time savings. In addition, the present 
invention reduces the number of operations to encode and
decode an image when compared to JPEG and other compression
methods of a similar nature. Reducing the number of
operations reduces the amount of time and computing resources
needed to encode and decode an image, and thus improves
computer system response times.

Furthermore, the image compression system of the present 
invention optimizes the encoding process to accommodate 
different image types. As explained below, the present 
invention uses fuzzy logic techniques to automatically 
analyze and decompose a source image, classify its
components, select the optimal compression method for each
component, and determine the optimal content-sensitive
parameters of the selected compression methods. The encoder
does not need prior information regarding the type of image
or information regarding which compression methods to apply.

Thus, a user does not need to provide compression system 
customization or need to set the parameters of the 
compression methods. 

The present invention is designed with the goal of 
providing an image compression system that reliably
compresses any type of image with the highest achievable
efficiency, while maintaining a consistent range of viewing 
qualities. Automating the system's adaptivity to varied 
image types allows for a minimum of human intervention in the 
encoding process and results in a system where the 
compression and decompression process are virtually 
transparent to the users. 

The encoder and decoder of the present invention contain 
a library of encoding methods that are treated as a 
"toolbox." The toolbox allows the encoder to selectively
apply particular encoding methods or tools that optimize the

displayed picture by selectively adding detail and correcting
pixel values. 

Like the encoding process, the decoding of an image can 
be viewed as a multi-step process that uses a variety of 
decoding methods which include inverse mathematical 
transformations, inverse quantization techniques, etc. 
Conventional decoders are designed to have an inverse 
function relative to the encoding system. These inverse 
decoding methods must match the encoding process used to 
encode the image. In addition, where an encoder makes 
content-sensitive adaptations to the compression algorithm, 
the decoder must apply a matching content-sensitive decoding
process.

Generally, a decoder is designed to match a specific 
encoding process. Prior art compression systems exist that 
allow the decoder to adjust particular parameters, but the 
prior art encoders must also transmit accompanying tables and 
other information. In addition, many conventional decoders 
are limited to specific decoding methods that do not 
accommodate content-sensitive adaptations.

Summary of the Invention

The problems outlined above are solved by the method and 
apparatus of the present invention. That is, the computer- 
based image compression system of the present invention 
includes a unique encoder which compresses images and a 
unique decoder which decompresses images. The unique
compression system obtains high compression ratios at all 
image quality levels while achieving relatively quick 
encoding and decoding times. 

A high compression ratio enables faster image 
transmission and reduces the amount of storage space required 
to store an image. When compared with conventional 
compression techniques, such as the Joint Photographic 
Experts Group (JPEG) standard, the present invention significantly
increases the compression ratio for color images which, when
decompressed, are of comparable quality to the JPEG images.

comparative efficiency. These compression methods can be
selectively applied to optimize an encoder with respect to a 
certain type of image. In addition to selectively applying 
various compression methods, it is also possible to optimize 
an encoder by varying the parameters (e.g., quantization
tables) of a particular compression method.

Broadly speaking, however, the prior art does not 
provide an adaptive encoder that automatically decomposes a 
source image, classifies its parts, and selects the optimal
compression methods and the optimal parameters of the
selected compression methods, resulting in an optimized
encoder that increases relative compression rates.

Once an image is optimally compressed with an encoder, 
the set of compressed data is stored in a file. The
structure of the compressed file is referred to as the file
format. The file format can be fairly simple and common, or
the format can be quite complex and include a particular
sequence of compressed data or various types of control
instructions and codes.

The file format (the structure of the data in the file)
is especially important when compressed data in the file will
be read and processed sequentially and when the user desires
to view or transmit only part of a compressed image file.
Accordingly, it would be advantageous to provide a file
format that "layers" the compressed image components,
arranging those of greatest visual importance first, those of
secondary visual importance second, and so on. Layering the
compressed file format in such a way allows the first segment
of the compressed image file to be decoded prior to the
remainder of the file being received or read by the decoder.
The decoder can display the first segment (layer) as a
miniature version of the entire image or can enlarge the
miniature to display a coarse or "splash" quality rendition
of the original image. As each successive file segment or
layer is received, the decoder enhances the quality of the

image. Thus, compression speeds the transmission of image
files by reducing their size. 

Several processes have been developed for compressing 
the data required to represent an image. Generally, the 
processes rely on two methods: 1) spatial or time domain 
compression, and 2) frequency domain compression. In 
frequency domain compression, the binary data representing 
each pixel in the space or time domain are mapped into a new 
coordinate system in the frequency domain. 

In general, the mathematical transforms, such as the
discrete cosine transform (DCT), are chosen so that the
signal energy of the original image is preserved, but the 
energy is concentrated in a relatively few transform 
coefficients. Once transformed, the data is compressed by 
quantization and encoding of the transform coefficients. 
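
The energy compaction property described above can be checked numerically with a short Python sketch. It applies an orthonormal one-dimensional DCT-II to a hypothetical row of smoothly varying pixel values; the sample values and function names are illustrative assumptions.

```python
import math

def dct_ortho(x):
    """Orthonormal DCT-II of a one-dimensional sequence."""
    n = len(x)
    out = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                    for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out

def energy(values):
    """Signal energy: the sum of squared samples."""
    return sum(v * v for v in values)

row = [100, 102, 104, 107, 110, 112, 115, 117]  # hypothetical pixel row
coeffs = dct_ortho(row)

# With orthonormal scaling the transform preserves total energy.
print(abs(energy(row) - energy(coeffs)) < 1e-6)

# Share of the energy held by the two largest coefficients.
top_two = sum(sorted((c * c for c in coeffs), reverse=True)[:2])
print(top_two / energy(coeffs) > 0.99)
```

For a smooth row such as this one, almost all of the energy lands in the first few coefficients, which is why the remaining coefficients can be coarsely quantized with little visible loss.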

Optimization of the process of compressing an image 
includes increasing the compression ratio while maintaining 
the quality of the original image, reducing the time to 
encode an image, and reducing the time to decode a compressed 
image. In general, a process that increases the compression
ratio or decreases the time to compress an image results in
a loss of image quality. A process that increases the 
compression ratio and maintains a high quality image often 
results in longer encoding and decoding times. Accordingly, 
it would be advantageous to increase the compression ratio 
and reduce the time needed to encode and decode an image 
while maintaining a high quality image. 

It is well known that image encoders can be optimized 
for specific image types. For example, different types of 
images may include graphical, photographic, or typographic 
information or combinations thereof. As discussed in more 
detail below, the encoding of an image can be viewed as a 
multi-step process that uses a variety of compression methods 
which include filters, mathematical transformations, 
quantization techniques, etc. In general each compression
method will compress different image types with varying

allowing the encoding of 16.8 million (2^8 x 2^8 x 2^8) different
colors.
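
The figure of 16.8 million follows from raising two to the number of bits per primary color and multiplying across the three primaries. A brief Python check, with variable names chosen only for illustration:

```python
bits_per_channel = 8   # bits assigned to each primary color
channels = 3           # red, green and blue

levels_per_channel = 2 ** bits_per_channel
colors = levels_per_channel ** channels
print(levels_per_channel)  # 256
print(colors)              # 16777216
```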

Consequently, color images require large amounts of 
storage capacity. For example, a typical color (24 bits per 
pixel) image with a resolution of 640 by 480 requires
approximately 922,000 bytes of storage. A larger 24-bit
color image with a 2000 by 2000 pixel resolution requires
approximately twelve million bytes of storage. As a result,
image-based applications such as interactive shopping,
multimedia products, electronic games and other image-based
presentations require large amounts of storage space to
display high quality color images.
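
The storage figures quoted above can be reproduced with a few lines of Python. The function name raw_image_bytes is a hypothetical helper, and the calculation assumes an uncompressed image with no header or padding.

```python
def raw_image_bytes(width, height, bits_per_pixel=24):
    """Bytes needed to store an uncompressed image."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes

print(raw_image_bytes(640, 480))    # 921600
print(raw_image_bytes(2000, 2000))  # 12000000
```

The first result is the "approximately 922,000 bytes" of the text and the second is the twelve million bytes.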

In order to reduce storage requirements, an image is 
compressed (encoded) and stored as a smaller file which
requires less storage space. In order to retrieve and view
the compressed image, the compressed image file is expanded
(decoded) to its original size. The decoded (or
"reconstructed") image is usually an imperfect or "lossy"
representation of the original image because some information
may be lost in the compression process. Normally, the
greater the amount of compression the greater the divergence
between the original image and the reconstructed image. The
amount of compression is often referred to as the compression
ratio. The compression ratio is the amount of storage space
needed to store the original (uncompressed) digitized image
file divided by the amount of storage space needed to store
the corresponding compressed image file.
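
The definition of the compression ratio translates directly into code. In the Python sketch below, the compressed size of 46,080 bytes is an invented figure used only to make the arithmetic concrete.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Original size divided by compressed size."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes

# A 640 by 480, 24-bit image (921,600 bytes) stored in a
# hypothetical 46,080-byte compressed file.
ratio = compression_ratio(921600, 46080)
print(ratio)  # 20.0
```

A ratio of 20.0 is conventionally written 20:1.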

By reducing the amount of storage space needed to store 
an image, compression is also used to reduce the time needed
to transfer and communicate images to other locations. In
order to transfer an image, the data bits that represent the
image are sent via a data channel to another location. The
sequence of transmitted bytes is called the data stream.
Generally, the image data is encoded and the compressed image
data stream is sent over a data channel and when received,

the compressed image data is decoded to recreate the original

METHOD AND APPARATUS FOR COMPRESSING IMAGES

Background of the Invention

Field of the Invention

This invention relates to the compression and
decompression of digital data and, more particularly, to the
reduction in the amount of digital data necessary to store
and transmit images.

Background of the Invention

Image compression systems are commonly used in computers 
to reduce the storage space and transmittal times associated 
with storing, transferring and retrieving images. Due to 
increased use of images in computer applications, and the 
increase in the transfer of images, a variety of image 
compression techniques have attempted to solve the problems 
associated with the large amounts of storage space (i.e., 
hard disks, tapes or other devices) needed to store images. 

Conventional devices store an image as a two-dimensional 
array of picture elements, or pixels. The number of pixels 
determines the resolution of an image. Typically the 
resolution is measured by stating the number of horizontal 
and vertical pixels contained in the two dimensional image 
array. For example, a 640 by 480 image has 640 pixels across 
and 480 from top to bottom to total 307,200 pixels. 

While the number of pixels represents the image 
resolution, the number of bits assigned to each pixel 
represents the number of available intensity levels of each 
pixel. For example, if a pixel is only assigned one bit, the 
pixel can represent a maximum of two values. Thus the range 
of colors which can be assigned to that pixel is limited to 
two (typically black and white). In color images, the bits
assigned to each pixel represent the intensity values of the
three primary colors of red, green and blue. In present
"true color" applications, each pixel is normally represented
by 24 bits where 8 bits are assigned to each primary color

WHAT IS CLAIMED IS:

1. An encoder that compresses digital images, 
comprising: 

a first data compressor that receives an input 
digital image comprising a plurality of pixels, each 
pixel having at least one component representing an 
intensity level, said pixels of said input digital image 
being represented by a first plurality of data bytes, 
said first data compressor outputting decimated data 
bytes, said decimated data bytes comprising a second 
plurality of data bytes, said second plurality of data 
bytes being less in number than said first plurality of 
data bytes, said second plurality of data bytes 
representing said plurality of pixels; 

a second data compressor that receives data 
corresponding to said second plurality of data bytes, 
said second data compressor outputting further decimated 
data bytes, said further decimated data bytes comprising 
a third plurality of data bytes that represent a 
compressed coarse quality digital image; 

a residual calculator that receives said first 
plurality of data bytes and said second plurality of 
data bytes, said residual calculator expanding said 
third plurality of data bytes to a fourth plurality of 
data bytes having a same number of data bytes as said 
second plurality of data bytes, said residual calculator calculating a
difference between said second plurality of data bytes 
and said fourth plurality of data bytes, said difference 
comprising a plurality of residual data bytes; and 

a compressor that compresses said plurality of 
residual data bytes to generate compressed residual data 
bytes, said encoder outputting said third plurality of 
data bytes and said compressed residual data bytes for
storage as a compressed image, said third plurality of
data bytes representing a coarse quality digital image,
said compressed residual data bytes expandable to said
plurality of residual data bytes and combinable with
said third plurality of data bytes to convert said third 
plurality of data bytes to said second plurality of data 
bytes representing a higher quality image. 

2. A decoder that receives compressed data input 
stream representing a digital image, said decoder outputting 
expanded data representing a reconstructed digital image, 
said decoder comprising: 

a decompressor that receives a first portion of 
said compressed data input stream, said decompressor 
expanding said first portion of said compressed data 
stream to first expanded data and outputting said first 
expanded data as a first output data stream to be 
displayed as a coarse quality digital image; 

a second compressor that receives a second portion 
of said compressed data input stream, said decompressor 
expanding said second portion of said data input stream 
to second expanded data; and 

an adder that combines said second expanded data 
with said first expanded data to generate a second 
output data stream to be displayed as a higher quality 
digital image, said first output data stream and said 
second output data stream independently selectable for 
display as a digital image. 

3. A system that compresses and decompresses data
representing a digital image so that different quality levels 
of said digital image can be selectably displayed on a video 
monitor, said system comprising: 

an encoder that compresses said data into 
compressed data layers that comprise data having a 
plurality of data formats, wherein at least a first one 
of said data layers comprises data representing a low 
quality image and at least a second one of said data
layers comprises data to convert said low quality image
to a higher quality image; and 

a decoder that receives said compressed data 
layers, said decoder expanding said first one of said 
data layers to generate first expanded output data 
representing a low quality image, said decoder expanding 
said second one of said data layers to generate second 
expanded output data, said decoder combining said second 
expanded output data with said first expanded output
data to generate output data representing a higher 
quality image. 



4. A method of producing image information 
indicative of an original image to be sent over a channel in a 
way to receive the image information in progressively-rendered 

stages, comprising: 

forming a first compressed version of the image, 
the first compressed version being compressed relative to the 
original image by a first compression technique, and the first 
compressed version being of a smaller overall file size than the 
original image, the first compressed version including sufficient 
information such that, when displayed, a reduced-resolution 
version of the original image can be seen; and 

forming a second compressed version of the image, 
said second compressed version including additional information 
about the image beyond that information produced by said first
compressed version to produce a second image which has further
resolution than said reduced resolution version, said second 
compressed version formed using a second compression technique 
which is different than said first compression technique; and 

producing an output file indicative of said first 
compressed version and said second compressed version. 

5. A method as in claim 4, wherein said forming a
second compressed version includes analyzing at least a portion 
of information indicative of the image to determine an optimal 
compression scheme which will optimally compress said information 
from among a plurality of different compression schemes; and 

forming said second compressed version using said 
optimal compression technique as part of said second compression 
technique. 

6. A method as in claim 5, further comprising
dividing said information into blocks, and classifying each block 
of the image according to a particular one of said plurality of 
compression schemes that optimize an amount of compression for 
each said block. 

7. A method as in claim 4, further comprising
producing a third compressed version of the image, said third
compressed version comprising information providing a highest
quality version of the image and including additional information
beyond that information produced by said first and second
compressed versions.

8. A method as in claim 4, further comprising: 
initially selecting a number of stages in which 

said image will be compressed; 

and further comprising: 

compressing said image in said number of stages, 
said first and second compressed versions being the first two 
stages of said number of stages. 

9. A method as in claim 4, wherein said forming a 
first compressed version comprises obtaining a thumbnail image of 
the original image, said thumbnail being a version of the image 
that is sized and intended to be displayed in a smaller scale 
than the original image, over a smaller number of pixels than are
contained in the original image; and 

interpolating said thumbnail image into a full 
sized image as said first compressed version. 

10. A method as in claim 9, further comprising fitting 
said thumbnail image to a function which increases its accuracy.

11. A method as in claim 9, further comprising
processing information indicative of said thumbnail image to
determine which of a plurality of compression schemes will
optimize a compression ratio for said second compressed version,
and said forming a second compressed version comprises 
compressing said information using the compression scheme which 
will best compress said image. 

12. A method as in claim 11, wherein said plurality of 
compression schemes include vector quantization, discrete cosine 
transform, pulse code modulation, and run length encoding. 

13. A method as in claim 4, wherein said forming a 
first compressed version comprises decimating the original image 
by a predetermined factor along particular dimensions. 

14. A method as in claim 13, wherein said decimating 
comprises decimating each of the color components of the image by 
a factor of 2 along vertical and horizontal dimensions. 
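
By way of illustration only, the decimation of claims 13 and 14 and the interpolation of claim 9 might be sketched as follows. The function names are hypothetical, and pixel replication is assumed for the interpolation purely to keep the sketch short; the claims do not specify how the miniature is expanded.

```python
def decimate_by_2(plane):
    """Keep every second sample along both the vertical and horizontal axes."""
    return [row[::2] for row in plane[::2]]


def interpolate_by_2(small, height, width):
    """Expand a decimated plane back to height x width by pixel replication.

    Replication is an assumption of this sketch, not the claimed method.
    """
    return [[small[y // 2][x // 2] for x in range(width)] for y in range(height)]


# A 4x4 single-component plane; a color image would repeat this per component.
plane = [
    [10, 12, 20, 22],
    [11, 13, 21, 23],
    [30, 32, 40, 42],
    [31, 33, 41, 43],
]
thumbnail = decimate_by_2(plane)              # 2x2 miniature
restored = interpolate_by_2(thumbnail, 4, 4)  # coarse full sized image
```

The miniature holds one quarter of the samples of each component, which is why it can serve as the first, coarse layer.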

15. An image compression apparatus, comprising: 

a first element operating to receive a source 
image to be compressed; 

a formatting element which changes said source 
image into a form which is susceptible of being processed; 

a first compression device, including a decimating 
element which decimates and compresses said source image using a 
first compression technique that produces information indicative 
of a reduced quality image; 

a second image compressing device, which provides 
second image information about the source in addition to that 
contained in said first image information, said second image 
compressing device compressing using a second compression 
technique which is different than said first compression 
technique; and 

a message assembling element, assembling said 
first image information into a first part of a message to be 
transmitted and said second image information into a second part 
of the message, said first and second parts of the message being 
separately readable. 

16. An apparatus as in claim 15, further comprising an 
image-classifying element which determines a most efficient 
compression technique to compress information indicative of said 
source image among a plurality of compression techniques and said 
second image compressing device compresses said image to form 
said second image information using said most efficient 
technique. 

17. An apparatus as in claim 15, wherein said first 
image information is a decimated and re-interpolated image. 

18. A method of transferring a progressively-rendered, 
compressed image, over a finite bandwidth channel, comprising: 

producing a coarse quality compressed image at a 
source and transmitting said coarse quality compressed image over 
a channel as a first part of a transmission to a destination end; 

receiving the coarse quality compressed image at a 
receiver at the destination end at a first time and displaying an 
image based on said coarse quality compressed image on a display 
system of the receiver when received at said first time; 

creating additional information about the image, 
at the source end, from which a standard quality image can be 
displayed, said standard quality image being of a higher quality 
than said coarse quality image, and sending compressed 
information over said channel indicative of information for said 
standard quality image, said sending said standard quality image 
information occurring subsequent in time to said sending of all 
of said information for said coarse quality image; 

receiving said standard quality image information 
at the receiver at a second time, subsequent to the first time, 
and decompressing said standard quality image information, to 
improve the quality of the image displayed on said display 
system, and to display said standard quality image; 

obtaining further information about the image 
beyond the information in said standard quality image, to provide 
an enhanced quality image, and compressing said information for 
said enhanced quality image, said enhanced quality image having 
more image details than said standard quality image; 

transmitting said information for said enhanced 
quality image, at a time subsequent to transmitting said 
information for said coarse quality image and said standard 
quality image; and 

receiving said enhanced quality image information 
at said receiver, at a third time subsequent to said first and 
second times, and updating a display on said display system to 
display the additional enhanced quality image. 

19. A method as in claim 18, wherein said producing 
the coarse quality image uses a different compression technique 
than said creating additional information indicative of the 
standard quality image. 

20. A method as in claim 18, wherein said coarse 
quality image includes information indicative of a miniature 
version of an original image, and said displaying the coarse 
quality image comprises interpolating said miniature to a size of 
the original image and displaying said image. 

21. A method as in claim 19, wherein said creating 
additional information comprises determining a characteristic of 
the image, determining which of a plurality of different 
compression techniques will best compress the characteristic 
determined; and compressing said image using the determined 
technique. 

22. A method as in claim 21, further comprising 
determining a plurality of areas in said image, and determining, 
for each area, which of the plurality of different compression 
techniques will optimize the compression ratio. 

23. A method as in claim 22, further comprising 
interleaving and channel encoding different portions of the 
compressed image. 

24. A method as in claim 22, wherein said compression 
techniques include vector quantization and discrete cosine 
transform. 

25. A method as in claim 20, wherein said obtaining a 
miniature comprises decimating along vertical and horizontal 
axes. 

26. A layered progressively-compressed image 
compression system, comprising: 

a first image compression element obtaining a 
source image to be compressed and compressing said source image 
using a first image compression scheme to produce a first image 
layer; 

a second image compression element, said second 
image compression element compressing information indicative of 
said source image using a different compression technique than 
said first image compression element to produce a second image 
layer; and 

an output message assembling element, said output 
message assembling element receiving said first and second image 
layers from said first and second image compressing elements, 
respectively, a first image layer stored in a first area which 
will be output first, said first image layer including 
information from which a coarse image can be reconstructed; and 
said second image layer including information from which a finer 
image, having more detail than said coarse image, can be 
reconstructed, and said second layer being stored in a location 
where it will be transmitted after said first layer is 
transmitted. 

27. A system as in claim 26, wherein said first layer 
indicative of a coarse image is produced by obtaining a thumbnail 
miniature of the original information by decimating the source 
image, and interpolating said thumbnail miniature to a size of a 
full image display. 

28. A system as in claim 26, wherein there are a 
plurality of layers, each layer including a complete set of 
information to be displayed at a decoding end, each layer 
progressively including more information than a previous layer. 

29. A method of transmitting and displaying a 
compressed image comprising: 

first obtaining and sending a first layer of 
information indicative of a compressed miniature image at a first 
time; 

first receiving said first layer at said decoder 
end and decompressing and displaying a first coarse image 
indicative thereof; 

second obtaining and sending information 
indicative of a compressed improved resolution image having more 
details than said first coarse image, and transmitting said 
information at a second time subsequent to said first time; and 

second receiving and decompressing said improved 
resolution image information to provide an updated display which 
improves the resolution of said first coarse image. 

30. A method as in claim 29, wherein said obtaining 
coarse information comprises: 

transmitting information indicative of a 
compressed miniature of the image; 

receiving the compressed miniature of the image; 

interpolating the compressed miniature of the 
image into a full sized image; and 

displaying the full sized image. 

31. A method as in claim 30, wherein the first coarse 
image is compressed using a first compression technique and the 
second image is compressed using a second compression technique 
which is different from the first compression technique. 

32. A method as in claim 31, further comprising 
determining which of a plurality of different image compression 
techniques will most efficiently code information indicative of 
said image. 

33. A method as in claim 32, wherein said determining 
uses fuzzy logic techniques. 

34. Image encoding system, comprising: 

a first element, operating to receive a source 
image and to format the source image in a way to allow its 
coding; 

a first compression coder, which filters the 
formatted source image, to form a first compressed and coarse 
quality image; 

an image classifier, operating to classify the 
information contained in the image according to a characteristic 
thereof that is related to an amount by which the image can be 
compressed; 

a compression encoder, determining one of a 
plurality of compression methods that will optimize the amount of 
compression based on a result of said image classifier, and 
encoding said information using the optimized compression method 
to produce a second compressed image; 

a message assembling element interleaving 
information indicative of the first image and the second image 
into a desired form in message transmitting format, and 
transmitting said message to a channel. 

35. A system as in claim 34, wherein said first 
element includes a decimating stage, and said compression encoder 
uses a different kind of compression than said decimating. 

36. An image decoder system, comprising: 

a first element, connected to a transmission 
channel to receive transmitted, compressed data indicative of an 
image therefrom, said compressed data received in layers; 

a display interface which receives information to 
be displayed; 

a first layer detector and decompression element, 
detecting a complete first layer, and decompressing said first 
layer when complete, to produce first information indicative of a 
reduced quality image, based on said first layer after decoding 
said first layer using a decompression technique and sending said 
first information to said display interface; and 

a second layer detector and decompression element, 
receiving a second layer of image information, compressed using a 
different compression technique than said first layer, and 
detecting that at least a unit of said second layer has been 
completely received, and decompressing said second layer to 
produce additional information which is coupled to said display 
interface to improve a displayed image resolution. 

37. A system as in claim 36, further comprising a 
third layer detector, receiving and decompressing a third layer 
of information, forming a final display. 

38. A system as in claim 37, wherein said units of 
said second layer are display panels, each panel displayed when 
completed, to form the second layer of the image in panels. 

39. A method as in claim 31, wherein said first 
obtaining comprises decimating data on the image to form a 
reduced quality image, fitting the decimated data to a first 
model which partially restores source image detail lost by 
decimation, and calculating reconstruction values from the 
fitting. 

40. A method as in claim 39, further comprising using 
said reconstruction weights to interpolate the decimated data 
into a full sized image while minimizing a mean squared error 
between original image components and interpolated image 
components. 

41. A method as in claim 31, wherein said first step 
comprises forming miniature versions of the original source image 
for each of a plurality of primary colors. 

42. A system as in claim 36, further comprising an 
image classifying module, that determines a characteristic of the 
image indicative of a best technique of compression, to output a 
measure indicative of said best technique. 

43. A system as in claim 42, wherein said image 
classifying module uses fuzzy logic techniques. 

44. A system as in claim 42, wherein said image types 
include gray scale, graphics, text, photographs, high activity 
and low activity images. 

45. A method as in claim 29, wherein said first 
obtaining comprises obtaining a miniature image, and further 
comprising analyzing the miniature image to classify the image 
into one of a plurality of classes indicative of which of a 
plurality of compression techniques will best compress said 
image. 

46. A method of encoding a source image, comprising: 
obtaining a first compressed version of the image, 
said first compressed version of the image corresponding to a 
coarse version of the image indicative of coarse details only, 
said first compressed version of the image obtained using a first 
compression technique; 

analyzing said coarse version of the image to 
determine which of a plurality of different compression 
techniques will best further compress said image; and 

further compressing said image to obtain further 
information indicative of a better rendering of said image, than 
said coarse version of said image, using the compression 
technique determined by said analyzing. 

47. A method as in claim 46, wherein said first 
compression technique includes decimation of image components 
followed by interpolation. 



48. A method as in claim 46, further comprising: 
dividing information indicative of the image into 
a plurality of block units; 

classifying each block unit according to a 
characteristic which will most efficiently compress said each 
block unit; and 

outputting a control script that specifies an 
optimized compression method for each said block unit. 

49. A method as in claim 48, further comprising 
compressing, in a third stage, according to the control script. 

50. A method as in claim 49, further comprising 
channel encoding according to the control script. 

51. A method as in claim 46, further comprising 
evaluating information indicative of the coarse image, and 
determining discrete cosine transform coefficients of said 
information. 

52. A method as in claim 51, further comprising 
obtaining a reconstructed coarse image from the discrete cosine 
transform coefficients, determining a residual between the 
reconstructed image and the coarse image and compressing the 
residual. 

53. An image encoding device, comprising: 

a first stage, operating to produce first 
information indicating a reduced quality compressed version of 
the original image; 

a second stage which analyzes the first 
information to determine an image classification thereof, and 
outputs a control script indicative of an efficient compression 
method based on said image classification; 

a third stage, responsive to said control script, 
to select a compression method from among a plurality of 
compression methods based on said control script and compressing 
image information using said compression method to produce second 
information; and 

a fourth stage which assembles the first 
information and the second information into a message to be sent. 

54. A method as in claim 53, wherein said second stage 
comprises a discrete cosine transform device, operating to obtain 
discrete cosine transform coefficients indicative of the image 
and to determine quantization step sizes therefrom. 

55. A device as in claim 53, wherein said first stage 
comprises a component separator, separating chrominance 
components from luminance components; wherein said first stage 
decimates said chrominance components; and said third stage 
compresses said luminance components using a discrete cosine 
transform technique. 

56. A device as in claim 55, wherein said second stage 
comprises a discrete cosine transform coefficient determining 
device, determining optimal quantization step sizes, and 
quantizing discrete cosine transform coefficients using said 
optimal step sizes. 

57. A device as in claim 56, further comprising a 
dequantizer for dequantizing the discrete cosine transform 
quantized values to determine an error between the dequantized 
values and the original image to form a residual, and a fifth 
stage operating for compressing said residual. 

58. A device as in claim 57 wherein said fifth stage 
comprises an adaptive vector quantizer operating to compress the 
residual by matching the residual against a group of commonly- 
occurring block patterns in a codebook. 

59. A device as in claim 55, wherein said chrominance 
is compressed by decimating the color, and fitting the decimated 
data to a spline function to determine optimal reconstruction 
weights to minimize a mean squared error. 

60. An apparatus as in claim 15, wherein said first 
and second parts of the message are totally separate. 

61. A method of encoding an image, comprising: 

processing the image according to a first 
technique to produce a processed image; 

separating color components of the processed image 
from intensity components of the processed image; 

compressing said color components of the image by 
compressing using a color component compression technique; and 

further compressing the intensity components of 
the image using an intensity component compression technique 
different than the color component compression technique. 

62. A method as in claim 61 wherein said first 
technique is a compression technique which includes a decimation 
technique, said color component compression technique includes a 
discrete cosine transform compression technique and said 
intensity component compression technique includes a differential 
pulse code modulation technique. 



63. A method as in claim 61 wherein said first 
technique includes a decimation technique followed by a technique 
of reconstructing information from the decimated data obtained 
from the decimation technique. 

64. A method as in claim 61 further comprising 
determining optimal compression techniques, and producing a 
control script indicative thereof, at least one of said 
compression techniques being chosen based on said control script. 

65. A method as in claim 61 wherein said color 
component compression technique is an optimized discrete cosine 
transform. 



66. A method as in claim 64 wherein said color 
component compression technique is an optimized discrete cosine 
transform. 

67. A method as in claim 66 further comprising 
determining optimal quantization step sizes based on said control 
script. 

68. A method as in claim 66 further comprising reverse 
discrete cosine transforming information obtained by said color 
component compression technique using the discrete cosine 
transform-compressed signal, and determining a difference between 
the reverse-discrete-cosine transformed signal and the original 
signal to determine an error signal therebetween. 

69. A method as in claim 68 further comprising 
comparing said error to a codebook, and choosing a codebook entry 
which matches most closely with said error. 

70. A method as in claim 61 wherein said further 
compression is by a differential pulse code modulation. 
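
As a minimal sketch of the differential pulse code modulation named in claims 62 and 70, assuming a previous-sample predictor and a uniform quantizer (neither of which is specified by the claims), the encoder can quantize the difference between each sample and the value the decoder will have reconstructed so far. The function names and the step size are hypothetical.

```python
def dpcm_encode(samples, step=4):
    """Quantize the difference between each sample and the running prediction."""
    codes, prediction = [], 0
    for s in samples:
        code = round((s - prediction) / step)
        codes.append(code)
        prediction += code * step  # track what the decoder will reconstruct
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild the samples by accumulating the dequantized differences."""
    out, prediction = [], 0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out


samples = [100, 104, 109, 91]
codes = dpcm_encode(samples)
decoded = dpcm_decode(codes)
```

Because the encoder predicts from the reconstructed value rather than the original, quantization error does not accumulate from sample to sample.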

71. A compression device, comprising: 

a first element receiving an image to be 
compressed; 

a second element carrying out an initial 
compression on said image received by said first element to 
produce an output initially-compressed image indicative thereof; 

a third element which separates said output 
initially-compressed image into intensity components and color 
components; 

a fourth element which compresses said color 
components using a first compression technique; and 

a fifth element which compresses said intensity 
components using a second compression technique, different than 
said first compression technique. 

72. An encoding apparatus as in claim 71 wherein said 
first compression technique is a discrete cosine transform 
technique and said fifth compression technique is a differential 
PCM technique. 

73. A system as in claim 72 wherein said second 
element is a decimating and curve fitting compressor. 

74. A method of compressing data, comprising: 
first compressing said data using a discrete 
cosine transform technique to produce an output signal indicative 
of a discrete cosine transform-compressed data; 

reviewing said discrete cosine transform- 
compressed data, and re-converting said discrete cosine 
transform-compressed data to reconstructed data of the same form 
as the starting data, and determining differences between said 
starting data and said reconstructed data; 

comparing said differences to a plurality of 
quantized differences from a codebook and choosing a closest 
match; and 

forming an output message that includes 
coefficients of said discrete cosine transform and an index 
associated with said codebook, as compressed data indicative of 
the data. 
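
The steps of claim 74 can be sketched, under stated assumptions, as a transform stage followed by a codebook match on the reconstruction error. The sketch below uses a one-dimensional orthonormal DCT, a uniform quantization step, and a fixed three-entry codebook; all of these, together with the function names and the message layout, are hypothetical choices made for illustration.

```python
import math


def dct(block):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(block)
    return [
        math.sqrt((1 if k == 0 else 2) / n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            math.sqrt((1 if k == 0 else 2) / n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def encode_block(block, step, codebook):
    """Quantize DCT coefficients, then match the residual against the codebook."""
    quantized = [round(c / step) for c in dct(block)]
    reconstructed = idct([q * step for q in quantized])
    residual = [x - r for x, r in zip(block, reconstructed)]
    index = min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(residual, codebook[j])),
    )
    return {"dct": quantized, "vq_index": index}  # hypothetical message layout


def decode_block(message, step, codebook):
    """Rebuild the block from the coefficients plus the chosen codebook entry."""
    base = idct([q * step for q in message["dct"]])
    return [b + e for b, e in zip(base, codebook[message["vq_index"]])]


codebook = [[0, 0, 0, 0], [1, -1, 1, -1], [2, 0, -2, 0]]  # illustrative entries
message = encode_block([52, 55, 61, 66], step=8, codebook=codebook)
decoded = decode_block(message, step=8, codebook=codebook)
```

Including an all-zero entry in the codebook means the combined result is never worse, in squared error, than the transform stage alone.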

75. A method as in claim 74 wherein said data to be 
compressed is an original image which has been pre-compressed 
using a technique which is different than said discrete cosine 
transform technique and said codebook technique. 

76. A method as in claim 74 wherein said first 
technique is a decimate and curve fitting technique. 

77. A method of selectively coding an image, 
comprising: 

dividing said image into a plurality of areas, 
each area representing a portion of the image; 

comparing each said area with a value indicating 
whether said area should or should not be rendered in an enhanced 
mode; 

adding a prioritized value to an enhancement list 
for each of said areas that will be rendered in the enhanced 
mode; 

compressing values which are on said enhancement 
list using a high resolution compression technique; and 

compressing values which are not on said 
enhancement list using a different compression technique. 
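
One possible reading of the enhancement list of claim 77 is sketched below. The activity measure (the spread between the largest and smallest sample in an area), the threshold, and the function names are assumptions of this sketch; the claim requires only that each area be compared with a value and that a prioritized value be added to the list.

```python
def block_activity(block):
    """Spread between the largest and smallest sample; a stand-in measure."""
    flat = [v for row in block for v in row]
    return max(flat) - min(flat)


def build_enhancement_list(image, block_size, threshold):
    """Return (priority, row, col) for areas whose activity exceeds threshold."""
    entries = []
    for r in range(0, len(image), block_size):
        for c in range(0, len(image[0]), block_size):
            block = [row[c:c + block_size] for row in image[r:r + block_size]]
            activity = block_activity(block)
            if activity > threshold:
                entries.append((activity, r, c))
    return sorted(entries, reverse=True)  # most active areas first


image = [
    [10, 10, 0, 90],
    [10, 10, 80, 5],
    [50, 52, 7, 7],
    [51, 50, 7, 7],
]
enhanced = build_enhancement_list(image, block_size=2, threshold=10)
```

Areas on the returned list would then go to the high resolution technique, and all other areas to the different technique.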

78. A method as in claim 77 wherein said high 
resolution compression technique is a high resolution residual 
calculator. 

79. A method as in claim 74 wherein said areas are 
blocks of a pre-compressed image. 

80. A method as in claim 74 further comprising 
compressing said image using a discrete cosine transform, wherein 
said high resolution compression technique is a high resolution 
residual calculator, and said different compression technique 
less high resolution residual calculators. 



81. An element as in claim (control script) further 
comprising a channel encoder, obtaining a plurality of data 
segments each of which indicates data from one of said 
compression techniques, said encoding being done in accordance 
with the control script. 

82. An image compression system, comprising: 

a first compression element which pre-compresses 
an image to produce a pre-compressed image; 

a color converter device which separates said 
first pre-compressed image into intensity components and color 
components; 

an image classifier, which classifies said image 
to determine at least one compression technique which will most 
efficiently compress said image, and produces a control script 
indicative thereof; 

a compression element, including elements for 
operating according to one of a plurality of different 
compression techniques, receiving said control script and 
compressing based on one of said plurality of compression 
techniques based on said control script; and 

a channel coder which interleaves and channel- 
codes information from said compression element, said channel 
coder being responsive to said control script, and channel coding 
in accordance therewith. 

83. A system as in claim 82 wherein said channel coder 
includes a plurality of different kinds of encoding techniques, 
which are selected by said control script. 

84. A system as in claim 82 wherein said compressing 
element includes a discrete cosine transform compressing element 
and a differential pulse code modulation compressing element. 

85. An adaptive vector quantizing compression device, 
comprising: 

a first element for receiving data to be 
processed; 

an image subdivider, operating to subdivide the 
image data into a set of predetermined length of pixel blocks; 

a codebook, which includes a plurality of vectors 
that correspond to common patterns found in the population of 
data; 

a processing element, operating to determine a 
best match between said image element and said codebook by 
determining a minimum squared error summed over all elements in 
the block and to produce an index indicative thereof; and 

transmitting the codebook value in place of the 
original image data. 

86. A compressed file format, comprising: 

a header segment which describes overhead information about 
the system and an object being compressed; 

a plurality of segments of image information, each segment 
separate from each other segment, and each segment including 
separate information therefrom; 

each of said separate information being separately 
displayable information, and at least one segment of said 
information including an initial low resolution image. 
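
For illustration, a file with a header segment followed by separately readable image segments might be laid out as in the sketch below. The byte layout, the `LAYR` signature, the field widths and the function names are all hypothetical; the claims do not define a binary encoding.

```python
import struct

MAGIC = b"LAYR"  # hypothetical signature, not taken from the source


def pack_file(version, segments):
    """segments: list of (segment_type, payload_bytes) in transmission order."""
    out = MAGIC + struct.pack(">BB", version, len(segments))
    for seg_type, payload in segments:
        out += struct.pack(">BI", seg_type, len(payload)) + payload
    return out


def iter_segments(data):
    """Yield (segment_type, payload) one at a time, so each can be shown on arrival."""
    if data[:4] != MAGIC:
        raise ValueError("not a layered file")
    _version, count = struct.unpack_from(">BB", data, 4)
    offset = 6
    for _ in range(count):
        seg_type, length = struct.unpack_from(">BI", data, offset)
        offset += 5
        yield seg_type, data[offset:offset + length]
        offset += length


# Segment type 0 stands in for the initial low resolution image.
blob = pack_file(1, [(0, b"thumb"), (1, b"detail!!")])
layers = list(iter_segments(blob))
```

Prefixing each segment with its own length lets a reader skip or display a segment without parsing the ones that follow it.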

87. A data format as in claim 86, wherein said header 
segment includes information about a following data stream, 
including an indication of whether the data stream includes image 
information or includes resource information. 

88. The system as in claim 87, wherein said resource 
information includes look-up tables and vector quantization 
tables. 

89. The system as in claim 86, wherein said header includes 
information indicative of a version of coding used for the image 
information. 

90. A decoder system which decodes a compressed file into 
an uncompressed file, comprising: 

an element which receives the compressed file including a 
plurality of panels; 

a decompression element which serially expands each panel to 
produce file information therefrom; and 

a file memory, storing information indicative of the file, 
and receiving more information from each panel to provide 
progressively more file information than that present in a 
previous panel. 

91. A decoder as in claim 90, wherein said file is an image 
and a first panel of image data is decompressed to provide a 
coarse representation of the image. 

92. A decoder as in claim 91, wherein said first, coarse 
layer of the image comprises a miniature version of the image, 
having a smaller size than an original image, and which is 
interpolated to a full size image. 

93. A decoder as in claim 92, comprising a first decoding 
element which decompresses at least one panel comprising a 
thumbnail image, a second decoding element which decompresses at 
least one panel comprising a splash image, a third decoding 
element which decompresses at least one panel comprising 
information for a standard image and a fourth step which 
decompresses at least one panel to provide information to provide 
a high detail image. 

94. A decoder as in claim 92, wherein said interpolation 
includes an interpolator, controlled according to an 
interpolation factor, said interpolation factor controlling an 
amount of interpolation. 

95. A decoder as in claim 91, further comprising an element 
for decompressing a discrete cosine transform ("DCT") data 
segment. 

96. A decoder as in claim 95, further comprising an element 
for decompressing a vector quantized residual from the DCT data 
segment. 

97. A method of compressing an image, comprising: 
decomposing an initial image to be compressed into a 
plurality of sub-images, each sub-image having a content which is 
homogeneous in content of a particular feature; 

analyzing each of said sub-images and determining which of 
said sub-images are visually important; 

optimizing compression methods for each of said sub-images 
in a way such that visually important sub-images have more 
information associated therewith. 

98. A method as in claim 97, wherein one of said 
compression methods is a discrete cosine transform ("DCT") and 
said optimizing includes setting a control script indicating a 
quantization step size of said DCT. 

99. A method as in claim 97, wherein said compression 
technique is a vector quantization, and said optimizing includes 
setting a feature of a codebook used in said vector quantization. 

100. A method as in claim 97, wherein said classification 
uses fuzzy logic to determine, among a plurality of classes, 
whether said image content is more like one class or more like 
another class. 

101. A method as in claim 100, wherein said compression 
further comprises mapping input sets to corresponding output 
sets, said output sets indicating which of a plurality of 
compression methods to apply, and blending said output steps to 
provide a control script. 
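
A minimal sketch of the fuzzy classification and blending of claims 100 and 101 follows. It assumes triangular membership functions over a normalized activity score and a membership-weighted average as the blending step; the set names, shapes and output step sizes are hypothetical and are not taken from the source.

```python
def triangular(x, left, peak, right):
    """Degree of membership in a triangular fuzzy set, between 0 and 1."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical input sets over an activity score in [0, 1], each mapped to a
# hypothetical output: a DCT quantization step size for the control script.
RULES = [
    ("low activity", (-0.5, 0.0, 0.5), 32.0),
    ("medium activity", (0.0, 0.5, 1.0), 16.0),
    ("high activity", (0.5, 1.0, 1.5), 8.0),
]


def blended_step_size(activity):
    """Blend the rule outputs by a membership-weighted average."""
    activity = min(max(activity, 0.0), 1.0)  # keep the score inside the sets
    weights = [(triangular(activity, *shape), out) for _, shape, out in RULES]
    total = sum(w for w, _ in weights)
    return sum(w * out for w, out in weights) / total


step = blended_step_size(0.25)  # half "low", half "medium"
```

A score that falls between two sets receives an intermediate parameter rather than a hard switch from one class to another.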

102. A method of classifying an image, comprising: 
dividing said image into a plurality of sub-images, each 
said sub-image having a characteristic which is uniform within 
the subimage by at least a predetermined amount; 

carrying out a first kind of image compression on the 
subimage; 

obtaining a combinational overview on the results of the 
first kind of image compression to determine a profile of the 
image component; and 

comparing said combinational overview with a plurality 
of rules, using a plurality of fuzzy input sets having an input 
rule base, said input sets indicating a plurality of image types, 
to determine a type of said sub-image. 

103. A method as in claim 102, wherein said first image 
compression is a DCT compression, and said combinational overview 
is a combination of each DCT component, histogrammed to provide a 
frequency domain profile. 

104. A method as in claim 103, further comprising 
determining a plurality of spatial domain blocks, and matching 
the spatial domain blocks with a special pattern list. 

105. A method of identifying an optimal compression technique 
for a portion of an image, comprising: 

determining a histogram of coefficients of at least a first 
compression technique; and 

matching said histogram, using fuzzy logic, to a closest 
match to determine an ideal compression technique; and 

determining components of said image. 

106. A method of determining enhancement values for an 
image, comprising: 

dividing said image into a plurality of sub-images, each 
said sub-image having a predetermined characteristic; 

testing a parameter of each said sub-image against a 
threshold value, said threshold value being one which indicates 
that said sub-image has a lot of changes from a previous 
sub-image; and 

determining those parts of the image which compare with said 
normalized threshold as being enhanced portion images. 

107. Method as in claim 106, wherein said value is values of 
color components, said color components being compared against a 
normalized threshold value for said color components. 

108. Method as in claim 107, further comprising compressing 
said image using a discrete cosine transform technique and 
analyzing coefficients of the discrete cosine transform to 
determine regions where the compression ratio can be adjusted 
without affecting its quality. 